

Data Science Projects
This section showcases end-to-end projects in Data Science (DS), Data Analysis (DA), and Computer Science, spanning domains including aviation safety, finance, and full-stack development. Projects are hosted on their respective platforms — GitHub, Hugging Face (HF), and Kaggle — where notebooks, datasets, and live demos are available in full.

Project: Aviation Human Factor Incidents and Preventive Measures
Narrative: Built on ~111,000 NASA ASRS (Aviation Safety Reporting System) incident reports (2005–2025), filtered to ~40,000 human-factor incidents. The dataset was collected, cleaned, and preprocessed by the author. An end-to-end ML (Machine Learning) pipeline combines sentence embeddings (MiniLM) with structured features and feeds them to XGBoost (eXtreme Gradient Boosting) multi-label classifiers covering four target groups and 69 labels. A preventive recommendations layer sits on top of the model outputs. Deployed as a live Gradio app on Hugging Face Spaces.
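The feature-assembly step described above can be illustrated with a minimal, standard-library sketch. It is not the project's code: the field name (flight_phase), the label names, and the tiny two-value "embedding" are hypothetical stand-ins, and the MiniLM encoder and XGBoost training are deliberately left out. It only shows how a text embedding and one-hot structured features might be concatenated into one row, and how multi-label targets might be binarised.

```python
from typing import Dict, List, Sequence, Set


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical value as a one-hot vector over a fixed vocabulary."""
    return [1.0 if value == candidate else 0.0 for candidate in vocabulary]


def build_feature_row(
    embedding: Sequence[float],
    structured: Dict[str, str],
    vocabularies: Dict[str, Sequence[str]],
) -> List[float]:
    """Concatenate a text embedding with one-hot encoded structured fields.

    Fields are visited in sorted order so every row has the same layout.
    """
    row = list(embedding)
    for field in sorted(vocabularies):
        row.extend(one_hot(structured.get(field, ""), vocabularies[field]))
    return row


def binarize_labels(
    label_sets: Sequence[Set[str]], all_labels: Sequence[str]
) -> List[List[int]]:
    """Turn each report's set of labels into a 0/1 indicator vector."""
    return [
        [1 if label in labels else 0 for label in all_labels]
        for labels in label_sets
    ]
```

In a full pipeline, rows like these would be the input matrix and the indicator vectors the targets of a multi-label classifier, one output column per label.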

Project: Multi-Task NLP for Aviation Incident Risk Estimation
Narrative: Uses ~38,000 aviation safety incident reports from NASA's ASRS, covering January 2012 to March 2022 and including both structured metadata and unstructured narrative descriptions. The project applies NLP (Natural Language Processing) techniques to classify incident risk factors from free-text reports, combining text embeddings with structured tabular features. Deployed as a live Gradio app on Hugging Face Spaces.

Project: Finance Prediction
Narrative: Daily OHLCV (Open, High, Low, Close, Volume) stock data for major aircraft manufacturers and airlines (2015–2025), sourced from Yahoo Finance. Applies time-series analysis and regression models to explore price prediction in the aviation financial sector.
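One common way to frame daily closing prices for regression is to convert them to returns and then build lagged feature windows. The sketch below assumes that framing; the project's actual features and models are not documented here, and the function names are illustrative only.

```python
from typing import List, Sequence, Tuple


def daily_returns(closes: Sequence[float]) -> List[float]:
    """Simple day-over-day returns: (close_t - close_{t-1}) / close_{t-1}."""
    return [
        (closes[i] - closes[i - 1]) / closes[i - 1]
        for i in range(1, len(closes))
    ]


def make_lagged_dataset(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) where each row of X holds the previous n_lags values
    and y is the value that immediately follows them."""
    features: List[List[float]] = []
    targets: List[float] = []
    for i in range(n_lags, len(series)):
        features.append(list(series[i - n_lags:i]))
        targets.append(series[i])
    return features, targets
```

Because each target only ever sees values that precede it, a chronological train/test split on these rows avoids leaking future prices into training.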

Project: Netflix Movies Rating Prediction
Narrative: An independent regression project analysing Netflix movie metadata (2010–2025). Evaluates whether engineered metadata features can predict movie ratings using regression models.

Project: Safety in Aviation Industry — EDA (Exploratory Data Analysis)
Narrative: A Data Analytics project exploring historical aviation accident data (1908–2023). Merges two datasets to address two questions: Is it safer to fly today than in the past? Will it be even safer in the future? Includes trend analysis, fatality rate modelling, and a Power BI (Business Intelligence) dashboard.

Project: Movies & Shows Suggester Website (Harvard CS50 Final Project)
Narrative: A Flask web application that queries two movie databases (shows.db, movies.db) to recommend the best movies and shows from Netflix. Integrates the OpenAI API (Application Programming Interface) to handle natural language queries such as "best shows in 2021" or "movies with actors X, Y, Z."
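As an illustration of querying two SQLite databases in one request, the sketch below attaches a second database to a single connection and ranks titles from both. It is a sketch under assumptions: the table names, columns, and placeholder rows are hypothetical, in-memory databases stand in for shows.db and movies.db, and the Flask route and OpenAI call are omitted.

```python
import sqlite3
from typing import List, Tuple


def build_demo_connection() -> sqlite3.Connection:
    """Create two in-memory databases on one connection (stand-ins for
    movies.db and shows.db) and fill them with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS shows_db")
    conn.execute("CREATE TABLE main.movies (title TEXT, year INTEGER, rating REAL)")
    conn.execute("CREATE TABLE shows_db.shows (title TEXT, year INTEGER, rating REAL)")
    conn.executemany(
        "INSERT INTO main.movies VALUES (?, ?, ?)",
        [("Movie A", 2021, 8.0), ("Movie B", 2020, 9.5)],
    )
    conn.executemany(
        "INSERT INTO shows_db.shows VALUES (?, ?, ?)",
        [("Show A", 2019, 7.0), ("Show B", 2021, 9.0)],
    )
    conn.commit()
    return conn


def top_titles(
    conn: sqlite3.Connection, year: int, limit: int = 5
) -> List[Tuple[str, str, float]]:
    """Return the highest-rated movies and shows for a year, across both databases."""
    query = """
        SELECT title, 'movie' AS kind, rating FROM main.movies WHERE year = ?
        UNION ALL
        SELECT title, 'show' AS kind, rating FROM shows_db.shows WHERE year = ?
        ORDER BY rating DESC
        LIMIT ?
    """
    return conn.execute(query, (year, year, limit)).fetchall()
```

Passing the year and limit as bound parameters, rather than formatting them into the SQL string, keeps user-supplied values from being interpreted as SQL.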

Project: ASRS Dataset Creation
Narrative: Documents the full pipeline used to collect and prepare the NASA ASRS aviation safety incident dataset — downloaded in batches of up to 5,000 records per query, cleaned, and published as a standalone open dataset on Hugging Face and Kaggle.
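One way to stay under a per-query cap of 5,000 records is to group consecutive time periods greedily until adding the next one would exceed the cap. The sketch below assumes per-period record counts are known in advance; that assumption, the function name, and the example counts in the docstring are illustrative and do not describe the documented pipeline.

```python
from typing import List, Sequence, Tuple


def plan_batches(
    counts_by_period: Sequence[Tuple[str, int]], cap: int = 5000
) -> List[List[str]]:
    """Greedily group consecutive periods so each group's total stays within cap.

    Example input (made-up counts): [("2020-01", 3000), ("2020-02", 1500)].
    A single period larger than the cap is kept as its own group and would
    need to be split further by the caller.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_total = 0
    for period, count in counts_by_period:
        # Close the running group if this period would push it over the cap.
        if current and current_total + count > cap:
            batches.append(current)
            current, current_total = [], 0
        current.append(period)
        current_total += count
    if current:
        batches.append(current)
    return batches
```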

Project: Currency Transfer Tracker (Harvard CS50SQL Final Project)
Narrative: A relational database designed to track personal financial movements — investments, transfers, and expenditures — built as the final project of the Harvard CS50SQL course. Demonstrates schema design, normalization, and query writing.
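A normalized layout for this kind of tracker might separate accounts from the movements recorded against them. The sketch below is hypothetical: the table names, columns, and constraint choices are assumptions made for illustration, not the project's actual schema, and it uses Python's sqlite3 module with an in-memory database.

```python
import sqlite3
from typing import Dict

# Hypothetical schema: each movement references exactly one account, and the
# CHECK constraint restricts movements to the three kinds named above.
SCHEMA = """
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    currency TEXT NOT NULL
);
CREATE TABLE movements (
    id INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),
    kind TEXT NOT NULL CHECK (kind IN ('investment', 'transfer', 'expenditure')),
    amount REAL NOT NULL,
    moved_on TEXT NOT NULL
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def totals_by_kind(conn: sqlite3.Connection, account_name: str) -> Dict[str, float]:
    """Sum movement amounts per kind for one account."""
    rows = conn.execute(
        """
        SELECT m.kind, SUM(m.amount)
        FROM movements AS m
        JOIN accounts AS a ON a.id = m.account_id
        WHERE a.name = ?
        GROUP BY m.kind
        ORDER BY m.kind
        """,
        (account_name,),
    ).fetchall()
    return dict(rows)
```

Keeping account details in their own table means a currency or name is stored once and referenced by id, which is the point of the normalization step.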

Project: Space Collision Game (Harvard CS50P Final Project)
Narrative: A real-time terminal game built in Python as the final project of the Harvard CS50P course. The player controls a meteor using keyboard inputs to avoid collisions with randomly generated planets. Demonstrates Python fundamentals, loop control, and library integration (blessed, random).
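The core loop of such a game can be reduced to three pure functions: spawn an obstacle, advance the obstacles, and check for a collision. The sketch below is an assumption about how that logic could be organised, not the game's source; all names are hypothetical, and terminal rendering and key reading with blessed are omitted so the logic stays testable on its own.

```python
import random
from typing import List, Tuple

Position = Tuple[int, int]  # (column, row), with row 0 at the top of the screen


def spawn_planet(width: int, rng: random.Random) -> Position:
    """Place a new planet at a random column on the top row."""
    return (rng.randrange(width), 0)


def advance(planets: List[Position], height: int) -> List[Position]:
    """Move every planet down one row and drop those that leave the screen."""
    return [(col, row + 1) for col, row in planets if row + 1 < height]


def move_player(player: Position, key: str, width: int) -> Position:
    """Shift the player left or right, clamped to the screen edges."""
    col, row = player
    if key == "left":
        col = max(0, col - 1)
    elif key == "right":
        col = min(width - 1, col + 1)
    return (col, row)


def has_collision(player: Position, planets: List[Position]) -> bool:
    """The player collides when it shares a cell with any planet."""
    return player in planets
```

A game loop would call these once per frame: read a key, move the player, advance and occasionally spawn planets, then stop when has_collision returns True.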