
Classification with NN and Tree-based Machine Learning Models: Wine Quality Dataset

In this project, I classified wine quality using several tree-based machine learning models and a simple neural network model:

  • Gradient Boosting Machines (GBM)
  • XGBoost
  • RandomForest
  • Neural Network

The key steps included:

  • Data Preparation and Preprocessing:
    • Data Exploration: Checked the data types in the dataset, identified missing values, and detected duplicates to ensure the dataset was clean for analysis.
    • Feature Selection: After an exploratory data analysis, I selected features based on their correlation with wine quality and on domain knowledge.
    • Data Transformation: Applied a log transformation to certain skewed features to bring their distributions closer to normal, improving model performance.
  • Model Training and Evaluation: Trained each model and tuned its hyperparameters with GridSearchCV to find the best parameters.
  • Results and Analysis: Among the tree-based models, RandomForest outperformed XGBoost and Gradient Boosting Machines (GBM), demonstrating its robustness and effectiveness in handling variability. The Neural Network model, despite its simple architecture, showed superior performance.
    However, the choice of model depends on how much weight is given to interpretability, deployment requirements, and operational considerations.
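
As a rough illustration of the steps above, here is a minimal sketch of the workflow using pandas and scikit-learn. It is not the project's actual code: the data is synthetic, the column names, the binary "quality" label, the choice of which features to log-transform, and the parameter grid are all assumptions made for the example, and only the RandomForest step is shown.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption: NOT the real wine dataset or its results).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.0, n),
    "residual_sugar": rng.lognormal(1.0, 0.8, n),        # right-skewed
    "total_sulfur_dioxide": rng.lognormal(4.5, 0.6, n),  # right-skewed
})
# Hypothetical binary label: 1 = higher quality, 0 = lower quality.
df["quality"] = (df["alcohol"] + rng.normal(0, 0.5, n) > 10.5).astype(int)
# Append a few repeated rows so the duplicate check has something to find.
df = pd.concat([df, df.iloc[:5]], ignore_index=True)

# Data exploration: data types, missing values, duplicates.
print(df.dtypes)
print(df.isna().sum())
n_duplicates = int(df.duplicated().sum())
print("duplicate rows:", n_duplicates)
df = df.drop_duplicates().reset_index(drop=True)

# Feature selection aid: correlation of each feature with the target.
corr_with_quality = df.corr()["quality"].drop("quality")
print(corr_with_quality.sort_values(ascending=False))

# Data transformation: log1p on the skewed features (hypothetical choice).
skewed = ["residual_sugar", "total_sulfur_dioxide"]
skew_before = df[skewed].skew()
df[skewed] = np.log1p(df[skewed])
skew_after = df[skewed].skew()

# Train/test split, stratified on the label.
X = df.drop(columns="quality")
y = df["quality"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameter tuning with GridSearchCV (illustrative grid only).
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("held-out accuracy:", test_accuracy)
```

The same GridSearchCV pattern should carry over to any estimator that follows the scikit-learn interface, such as GradientBoostingClassifier, by swapping the estimator and its parameter grid.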

Technologies used

  • Python
  • NN
  • Tree-based ML
View the code.