AI/ML Projects


Generation of Realistic 2D Scenes by Text-to-2D Models


Click to view the Research Report

Source code available

This project aimed to improve the text-to-image model AttnGAN in terms of text understanding and training efficiency.

  • Developed an architecture called Trans_AttnGAN that employs a pre-trained BERT as the text encoder to generate more contextually accurate sentence and word embeddings.
  • Designed a Soft Alignment Loss that leverages a pre-trained BLIP image-captioning model followed by BERT to generate fine-grained guidance at the sentence and word levels.
  • Verified that Trans_AttnGAN achieved comparable performance to AttnGAN with roughly half of the total training time on the CUB-200 dataset.


Music Genre Classifier Paper

Click to view the paper

Source code available

Trained and fine-tuned nine machine learning models for music genre classification: SVM, Logistic Regression, KNN, Naive Bayes, QDA, Random Forest, MLP, CatBoost, and XGBoost.

  • Group leader
  • Used the FMA music dataset, which contains metadata, features, and genres for over 100,000 tracks.
  • Preprocessed the data by cleaning, applying word2vec, filling missing values, and normalizing.
  • Selected features with the chi-square test and reduced dimensionality with PCA.
  • Used 5-fold cross-validation to fine-tune the nine models.
  • Evaluated the best configuration of each of the nine models on the test set using accuracy and macro F1 score.
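
The two evaluation metrics named above can be sketched in plain Python. This is a minimal illustration, not the project's evaluation code; the genre labels in the usage example are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: a rare genre that is never predicted drags macro F1 down
# much more than it drags accuracy down.
y_true = ["rock", "rock", "rock", "rock", "jazz", "pop"]
y_pred = ["rock", "rock", "rock", "rock", "rock", "pop"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro F1 weights every genre equally, it is a useful complement to accuracy when some genres have far fewer tracks than others.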


Movie Review Sentiment Classifier

Click to view the report

Source code available

Built an RNN-based model in PyTorch for sentiment classification of movie reviews.

  • Preprocessed the "Large Movie Review Dataset", including tokenization with spaCy and construction of the one-hot vocabulary.
  • Modified architecture: embedding layer + bidirectional 2-layer LSTM + linear layer.
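
The vocabulary step can be sketched as follows. This is a minimal sketch under stated assumptions: the reviews are already tokenized (the project used spaCy for that), and the `<pad>` and `<unk>` tokens, `min_freq`, and `max_len` are hypothetical choices not taken from the report. Each token is mapped to an integer index, which is equivalent to a one-hot vector once it is fed to an embedding layer.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(tokenized_reviews, min_freq=1):
    """Map each sufficiently frequent token to an integer index."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    vocab = {PAD: 0, UNK: 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to indices, truncating or padding to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


reviews = [["a", "great", "movie"], ["a", "bad", "movie"]]
vocab = build_vocab(reviews)
print(encode(["a", "boring", "movie"], vocab, max_len=5))
```

Unseen words such as "boring" fall back to the `<unk>` index, and short reviews are padded so that a batch has a uniform length.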


CNNs for Handwritten Digit Classification

Click to view the report

Source code available

The goal of this project was to build and modify two CNN models using Keras and PyTorch to classify handwritten digits.

  • Implemented baseline CNN models on the MNIST dataset for digit classification.
  • Tuned the model architectures by varying the number of layers, kernel sizes, and nodes.


ResNet and Gradient Vanishing

Click to view the report

Source code available

This project investigated the gradient vanishing problem and ResNet in TensorFlow and Keras.

  • Demonstrated the gradient vanishing issue in a feedforward network with tanh activation and solved the problem using ResNet.
  • Identified the maximum number of layers for ffnet (21 layers) and ResNet (63 layers) with tanh activation before gradient vanishing emerges.
  • Showed that switching the activation function to ReLU in ffnet and ResNet further increases their resilience to gradient vanishing (22 layers for ffnet, 81 for ResNet).
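
The effect can be illustrated with a toy scalar model. This is a sketch under simplifying assumptions (one unit per layer, one shared hypothetical weight of 0.5), not the project's TensorFlow networks, and it does not reproduce the layer counts reported above. It only shows why a skip connection keeps the input gradient from shrinking.

```python
import math


def input_gradient(depth, weight=0.5, x=0.3, residual=False):
    """Return d(output)/d(input) for a chain of `depth` scalar tanh layers."""
    grad = 1.0
    for _ in range(depth):
        pre = weight * x
        # Derivative of tanh(weight * x) with respect to x.
        local = weight * (1.0 - math.tanh(pre) ** 2)
        if residual:
            grad *= 1.0 + local  # the skip connection adds an identity path
            x = x + math.tanh(pre)
        else:
            grad *= local
            x = math.tanh(pre)
    return grad


for depth in (5, 15, 30):
    print(depth, input_gradient(depth), input_gradient(depth, residual=True))
```

In the plain chain every factor is at most 0.5, so the gradient decays geometrically with depth. In the residual chain every factor is at least 1, so the gradient cannot vanish.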


Analysis of MLP and CNN in Pattern Recognition

Click to view the report

This project analyzed the performance and parameter settings of MLP and CNN on the MNIST and CIFAR-10 datasets in MATLAB.

  • Experimented with various MLP structures and parameters on the MNIST dataset.
  • Implemented CNNs on the MNIST and CIFAR-10 datasets, adjusting channels and pooling methods.
  • Fine-tuned the best CNN on MNIST, which achieved training and testing accuracies of 0.968 and 0.9, respectively.


A Model of Residual Sugar Content and Volatile Acidity Content in Wine Using MLE

Click to view the paper

Applied maximum likelihood estimation (MLE) in Python to model the relationship between residual sugar content and volatile acidity content in wine.

  • Based on the chemical equation C6H12O6 + 2O2 = 2CH3COOH + 2CO2 + 2H2O, hypothesized a linear relationship between volatile acidity and residual sugar, i.e., residual sugar (y) and volatile acidity (x) can be related by a linear function y = a + bx, where a and b are coefficients.
  • Derived the MLE expressions for the parameters a and b mathematically.
  • Used the polyfit function to calculate the coefficients, visualized the results, and found that a linear model does not describe the relationship between residual sugar and volatile acidity well.
  • Finally, extended the experiment to quadratic and cubic models, compared the three models, and found that the cubic model performs best.
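
For the linear case, the closed-form estimates can be sketched as below. The sketch assumes independent Gaussian noise with constant variance around the line y = a + bx. The paper's exact noise model is not stated here, so this is an assumption. Under it, maximizing the likelihood is the same as minimizing the squared residuals, which gives b = S_xy / S_xx and a = mean(y) - b * mean(x). The sample values are hypothetical.

```python
def mle_linear_fit(x, y):
    """Closed-form MLE of a and b in y = a + b*x under i.i.d. Gaussian noise."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Centered cross-product and centered sum of squares.
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_mean - b * x_mean
    return a, b


# Hypothetical measurements: volatile acidity (x) and residual sugar (y).
x = [0.2, 0.4, 0.6, 0.8]
y = [1.4, 1.8, 2.2, 2.6]
print(mle_linear_fit(x, y))
```

Under the same assumption these are the ordinary least-squares coefficients, so a degree-1 `numpy.polyfit(x, y, 1)` should return the same values, with the slope first and the intercept second.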


A Comparative Analysis of Machine Learning Techniques for Phishing URL Detection

Click to view the paper

Compared two models for phishing URL detection: "Phishing Websites Classification using Hybrid SVM" and "PDRCNN: Precise Phishing Detection with Recurrent Convolutional Neural Networks".

  • Group leader
  • Compared the two models in terms of their datasets, feature extraction processes, model architectures, and performance.
  • According to the experiments, both models performed well, but PDRCNN was more accurate on a larger real-world dataset (97% accuracy versus 95.8% for the SVM).
  • Based on the performance, proposed potential improvements to the two models. For example, SVM + KNN ensembling could be applied to the SVM model to improve efficiency and accuracy, while additional URL-based, page-based, and content-based features could be incorporated into PDRCNN.


Gradient Descent based Time Series Prediction Algorithm

Click to see the technical report

Source code available

This project used gradient descent in Python to predict a data series.

  • Implemented gradient descent to optimize the prediction model's parameters.
  • Analyzed convergence behavior by plotting cost over iterations.
  • Investigated the effect of the learning rate on convergence.
  • Tested the algorithm on multiple datasets for verification.
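
The loop described above can be sketched as follows. The prediction model is not specified here, so this sketch assumes a linear trend y = w0 + w1 * t fitted with a mean-squared-error cost. The model, the data, and the learning rates are all hypothetical.

```python
def gradient_descent(t, y, lr=0.05, iterations=500):
    """Fit y = w0 + w1 * t by batch gradient descent; return weights and cost history."""
    w0, w1 = 0.0, 0.0
    n = len(t)
    costs = []
    for _ in range(iterations):
        errors = [w0 + w1 * ti - yi for ti, yi in zip(t, y)]
        costs.append(sum(e * e for e in errors) / (2 * n))  # cost before this update
        grad_w0 = sum(errors) / n
        grad_w1 = sum(e * ti for e, ti in zip(errors, t)) / n
        w0 -= lr * grad_w0
        w1 -= lr * grad_w1
    return (w0, w1), costs


t = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]  # generated from y = 1 + 2t

(w0, w1), costs = gradient_descent(t, y, lr=0.05, iterations=2000)
print(w0, w1, costs[0], costs[-1])

# A learning rate that is too large makes the cost grow instead of shrink.
_, diverging = gradient_descent(t, y, lr=0.5, iterations=50)
print(diverging[0], diverging[-1])
```

Plotting `costs` against the iteration index gives the convergence curve, and repeating the run with several values of `lr` shows how the learning rate separates slow convergence, fast convergence, and divergence.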


Feature Selection using Chi-Square Testing for Phishing URL Detection

Click to see the report

This project identified the most important features using chi-square feature selection on a dataset of phishing and legitimate URLs in Python.

  • Found that the top features were predominantly URL-based. Domain-based and popularity features also contributed significantly, while content features had less impact.
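
The ranking idea can be sketched with the Pearson chi-square statistic on a 2x2 contingency table. This is a minimal sketch for binary features and a binary label only. The feature names and values are hypothetical, and library implementations of chi-square feature selection may compute the score differently.

```python
def chi_square_score(feature, label):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    n = len(feature)
    observed = [[0, 0], [0, 0]]  # observed[feature value][label value]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n  # count expected under independence
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def rank_features(features, label):
    """Sort feature names by chi-square score, highest first."""
    scores = {name: chi_square_score(values, label) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


label = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = legitimate
features = {
    "has_ip_in_url": [1, 1, 1, 1, 0, 0, 0, 0],  # hypothetical URL-based feature
    "uses_https": [1, 0, 1, 0, 1, 0, 1, 0],     # hypothetical feature unrelated to the label
}
print(rank_features(features, label))
```

A feature that is independent of the label scores 0, while one that separates the classes perfectly scores n (here 8), so sorting by the score puts the most informative features first.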


A Comparative Experiment between Neural Network and Random Forest

Click to see the report

This project compared random forest and neural networks on classification tasks in Python in terms of training and testing accuracy, model complexity, and modeling differences.

  • Random forest achieved comparable accuracy to neural networks while requiring significantly less training and testing time, making it preferable for lightweight tasks.
  • Hyperparameter tuning was simpler for random forest than neural networks, demonstrating its advantage in ease of optimization over more flexible deep models.


Importance of Feature Normalization for Different ML Models

Click to see the report

This project investigated the impact of feature normalization on the performance of various machine learning models including KNN, Logistic Regression, Decision Tree, Random Forest and SVM in Python.

  • Found that distance-based and margin-based models such as KNN and logistic regression benefit directly from normalization, while condition-based models such as decision trees are not sensitive to it.
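
A small sketch can show why a distance-based model reacts to normalization. It assumes min-max scaling and a 1-nearest-neighbor lookup; the two features (age and income) and their values are hypothetical and not taken from the report.

```python
import math


def min_max_normalize(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_neighbor(rows, query_index):
    """Index of the row closest (Euclidean distance) to rows[query_index]."""
    query = rows[query_index]
    best, best_dist = None, math.inf
    for i, row in enumerate(rows):
        if i == query_index:
            continue
        d = math.dist(query, row)
        if d < best_dist:
            best, best_dist = i, d
    return best


# Hypothetical rows of (age in years, income in dollars).
rows = [[25, 50000], [60, 50500], [27, 52000], [45, 80000]]
print(nearest_neighbor(rows, 0))                     # income dominates the raw distance
print(nearest_neighbor(min_max_normalize(rows), 0))  # both features count after scaling
```

On the raw data the large income scale decides the neighbor on its own. After scaling, the similar age matters too and the neighbor changes. A decision tree splits on thresholds within a single feature, and min-max scaling preserves the ordering of values in each column, which is consistent with the insensitivity noted above.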


Comparative Study of SVM and Neural Networks in Diabetes Classification

Click to see the report

This project compared the performance of support vector machines and neural networks for classification of the Pima Indians Diabetes dataset using Microsoft Azure Machine Learning Studio.

  • Implemented SVM and neural network models in Azure ML for binary classification.
  • Evaluated model performance on test data using metrics such as accuracy and AUC.
  • Explored parameter tuning and different model configurations.
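
For reference, the AUC metric mentioned above can be computed by hand. The project itself used Azure Machine Learning Studio, so this is only an illustrative Python sketch with hypothetical scores. It uses the pairwise definition: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four patients (1 = diabetic).
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Unlike accuracy, AUC does not depend on a single decision threshold, which makes it a useful second view when comparing the SVM and the neural network.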