Wine Recommendation Analytics with Airflow, BigQuery, Tableau, and a Custom API
In this project, a custom API was developed alongside a two-stage data pipeline built with Apache Airflow to automate data processing and management tasks for wine data analytics. Tableau was used for visualization. The work was carried out in four stages:
- First stage: Created a PostgreSQL database containing wine-related data (wineries, wines, ratings, and harmonizers) and developed custom API endpoints with Flask for accessing the data.
- Second stage: Automated the cleaning and organizing of data from the custom API, including tasks such as renaming columns and merging datasets into tables for wines and related information, using Pandas and an Apache Airflow DAG.
- Third stage: Automated the data transfer to Google BigQuery and implemented advanced data management strategies, including clustered tables to improve query performance for wine types and countries of origin.
- Fourth stage: Created two Tableau dashboards to visualize the data and provide an analytical overview. The dashboards connect directly to the BigQuery data, allowing for real-time data exploration and interaction.
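The renaming and merging described in the second stage could look roughly like the sketch below. It is a minimal illustration, not the project's actual code: the sample records, the column names (`wine_id`, `winery_id`, `winery_name`, and so on), and the helper `build_wines_table` are all hypothetical. In the real pipeline a function like this would presumably run as a task inside the Airflow DAG, with the input coming from the API endpoints.

```python
import pandas as pd

# Hypothetical records, shaped like the JSON the custom API might return.
wines_raw = pd.DataFrame([
    {"id": 1, "wine_name": "Reserva Tinto", "winery_id": 10, "wine_type": "Red"},
    {"id": 2, "wine_name": "Vinho Verde", "winery_id": 11, "wine_type": "White"},
])
wineries_raw = pd.DataFrame([
    {"id": 10, "name": "Quinta A", "country": "Portugal"},
    {"id": 11, "name": "Quinta B", "country": "Portugal"},
])


def build_wines_table(wines: pd.DataFrame, wineries: pd.DataFrame) -> pd.DataFrame:
    """Rename ambiguous columns, then attach winery details to each wine."""
    # Give the generic "id" and "name" columns unambiguous names before merging.
    wines = wines.rename(columns={"id": "wine_id"})
    wineries = wineries.rename(columns={"id": "winery_id", "name": "winery_name"})
    # Left join keeps every wine; validate= raises if a winery_id is duplicated
    # on the wineries side, which would silently multiply wine rows.
    return wines.merge(wineries, on="winery_id", how="left", validate="many_to_one")


wines_table = build_wines_table(wines_raw, wineries_raw)
print(wines_table)
```

Renaming before the merge avoids the `_x`/`_y` suffixes Pandas would otherwise add to the two clashing `id` columns.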
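For the clustered tables mentioned in the third stage, one way to express the idea is a BigQuery DDL statement with a `CLUSTER BY` clause on the columns that queries filter on most. The sketch below only builds that statement as a string. The table name `my_project.wine_analytics.wines`, the column names, and the helper `clustered_table_ddl` are assumptions made for illustration; the project may equally have configured clustering through the BigQuery client or an Airflow operator.

```python
def clustered_table_ddl(table: str, columns: dict[str, str], cluster_by: list[str]) -> str:
    """Build a BigQuery CREATE TABLE statement with a CLUSTER BY clause."""
    # BigQuery accepts between one and four clustering columns.
    if not 1 <= len(cluster_by) <= 4:
        raise ValueError("BigQuery supports 1 to 4 clustering columns")
    missing = [name for name in cluster_by if name not in columns]
    if missing:
        raise ValueError(f"clustering columns not in schema: {missing}")
    column_defs = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n"
        f"  {column_defs}\n"
        f")\n"
        f"CLUSTER BY {', '.join(cluster_by)}"
    )


# Hypothetical schema: cluster on the two columns the dashboards filter by.
ddl = clustered_table_ddl(
    "my_project.wine_analytics.wines",
    {
        "wine_id": "INT64",
        "wine_name": "STRING",
        "wine_type": "STRING",
        "country": "STRING",
        "rating": "FLOAT64",
    },
    cluster_by=["wine_type", "country"],
)
print(ddl)
```

The order of the clustering columns matters: BigQuery sorts data by the first column, then the second, so filters on `wine_type` alone benefit more than filters on `country` alone.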
Technologies used
- Flask
- Docker
- Apache Airflow
- Google Cloud - BigQuery
- Tableau