Projects - EDA
These projects showcase my practical skills with SQL, Python, Excel, R, Power BI, Tableau, AWS, GCP & Microsoft Azure. Where appropriate, I link to my code on GitHub & to downloadable copies of reports I have created.
Endangered Species EDA
- Used SQL JOINs and views to extract data from 10 related tables in the CITES Trade Database
- Transformed and filtered the data in SQL using aggregate & filter functions to improve the reporting process
- Loaded and visualised the data with Python, identifying key trends & objectives for the conservation of endangered animals
- Developed interactive dashboards in Power BI that provided insights to support decision-making in conservation planning
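
As a rough illustration of the JOIN-and-view approach described above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (species, trades, trade_details) and the sample rows are hypothetical placeholders, not the actual CITES Trade Database schema or data.

```python
import sqlite3

# In-memory database with two hypothetical, related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    species_id INTEGER PRIMARY KEY,
    name       TEXT,
    appendix   TEXT
);
CREATE TABLE trades (
    trade_id   INTEGER PRIMARY KEY,
    species_id INTEGER REFERENCES species(species_id),
    year       INTEGER,
    quantity   INTEGER
);

-- Made-up rows, for illustration only.
INSERT INTO species VALUES (1, 'Panthera tigris', 'I'), (2, 'Python regius', 'II');
INSERT INTO trades VALUES
    (1, 1, 2020, 3), (2, 1, 2021, 5), (3, 2, 2020, 400), (4, 2, 2021, 250);

-- A view hides the JOIN so reporting queries stay short.
CREATE VIEW trade_details AS
SELECT t.year, t.quantity, s.name, s.appendix
FROM trades AS t
JOIN species AS s ON s.species_id = t.species_id;
""")

# Filter rows (WHERE), aggregate (SUM, GROUP BY), then filter groups (HAVING).
rows = conn.execute("""
SELECT name, SUM(quantity) AS total_quantity
FROM trade_details
WHERE year >= 2020
GROUP BY name
HAVING SUM(quantity) > 5
ORDER BY total_quantity DESC
""").fetchall()

for name, total in rows:
    print(name, total)
```

Wrapping the join in a view means each downstream report only needs a short SELECT against trade_details.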
Pharmaceutical Price Analysis Around the World
- Extracted pharmaceutical pricing data from multiple global datasets using SQL joins, filters & aggregations
- Cleaned and standardised inconsistent pricing records to improve comparability
- Built interactive dashboards in Power BI to visualise cost disparities across countries and support healthcare policy discussions
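
To give a sense of what standardising inconsistent pricing records can involve, the sketch below converts raw price strings to a per-unit price in one currency. Everything here is an assumption made for illustration: the function name standardise_price, the RATES_TO_USD table and its placeholder rates, and the idea that each record carries a pack size.

```python
import re

# Hypothetical exchange rates to USD; placeholder values, not real market data.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1, "GBP": 1.25}


def standardise_price(raw: str, currency: str, pack_size: int) -> float:
    """Return the price per unit in USD, rounded to 4 decimal places.

    Strips currency symbols, thousands separators and whitespace from `raw`,
    converts with RATES_TO_USD, then divides by the pack size.
    """
    cleaned = re.sub(r"[^\d.]", "", raw)  # keep digits and the decimal point
    if not cleaned:
        raise ValueError(f"no numeric price found in {raw!r}")
    if pack_size <= 0:
        raise ValueError("pack_size must be positive")
    amount_usd = float(cleaned) * RATES_TO_USD[currency.upper()]
    return round(amount_usd / pack_size, 4)


print(standardise_price("£1,250.00", "gbp", 50))
print(standardise_price(" 22.00 ", "EUR", 10))
```

This sketch assumes a dot as the decimal separator; records written as "1.250,00" would need locale-aware parsing first.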
Amazon Product Web Scraper & EDA
- Scraped and ingested product data using Python (BeautifulSoup) and centralised the cleaned data in Azure Synapse Analytics
- Stored and managed the large datasets in Azure Synapse Analytics, building ETL pipelines to clean, aggregate & standardise the data
- Enabled SQL queries through Microsoft Azure and connected Azure Synapse Analytics to Power BI for creating dashboards
- Created interactive dashboards in Power BI and carried out further exploratory data analysis in Python & RStudio
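
The scraping step can be sketched roughly as follows. The project itself used BeautifulSoup; to keep this example dependency-free it swaps in Python's standard-library html.parser instead, and it parses an inline HTML string rather than a live page. The markup, the class names (product, title, price) and the helper parse_products are all hypothetical.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collects title and price text from hypothetical product markup."""

    def __init__(self):
        super().__init__()
        self.products = []   # one dict per <div class="product">
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "product" in classes:
            self.products.append({})
        elif tag == "span" and self.products:
            if "title" in classes:
                self._field = "title"
            elif "price" in classes:
                self._field = "price"

    def handle_data(self, data):
        if self._field and data.strip():
            self.products[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


# Made-up sample markup, for illustration only.
SAMPLE_HTML = """
<div class="product"><span class="title">USB-C Cable</span><span class="price">$9.99</span></div>
<div class="product"><span class="title">Desk Lamp</span><span class="price">$24.50</span></div>
"""


def parse_products(html: str) -> list:
    """Return cleaned product records with the price as a float."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return [
        {"title": p["title"], "price": float(p["price"].lstrip("$"))}
        for p in parser.products
        if "title" in p and "price" in p  # skip incomplete records
    ]


print(parse_products(SAMPLE_HTML))
```

Records in this shape are straightforward to load into a warehouse table for the later cleaning and aggregation steps.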
Data Professional Survey Dashboard
- Retrieved raw survey data and imported it into Power BI for ETL
- Cleaned the data by standardising job titles, handling missing values & outliers, and filtering out inconsistent entries
- Restructured the data in Power Query by splitting and merging columns & reformatting categorical variables
- Built meaningful DAX measures and developed a dynamic dashboard that lets stakeholders explore the data by role, salary band & geography
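
The cleaning itself was done in Power Query, but the same kind of logic can be sketched in Python for readers who want to see it spelled out. The keyword list, the helper names standardise_title and salary_band_midpoint, and the "40k-65k" band format are assumptions made for this example, not details taken from the survey data.

```python
# Hypothetical keyword-to-label mapping for free-text job titles.
TITLE_KEYWORDS = [
    ("analyst", "Data Analyst"),
    ("engineer", "Data Engineer"),
    ("scientist", "Data Scientist"),
]


def standardise_title(raw):
    """Map a free-text job title to a standard label.

    Missing or blank titles become "Unknown"; unmatched ones become "Other".
    """
    if raw is None or not raw.strip():
        return "Unknown"
    text = raw.strip().lower()
    for keyword, label in TITLE_KEYWORDS:
        if keyword in text:
            return label
    return "Other"


def salary_band_midpoint(band: str) -> float:
    """Split a band such as "40k-65k" and return its midpoint in thousands."""
    low, high = (float(part.strip().lower().rstrip("k")) for part in band.split("-"))
    return (low + high) / 2


print(standardise_title("  Senior DATA analyst "))
print(standardise_title(None))
print(salary_band_midpoint("40k-65k"))
```

Splitting the band into a numeric midpoint mirrors the column-splitting step and gives the dashboard a value it can average by role or geography.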
Lego Models EDA
- Collected and processed LEGO datasets using SQL queries and Python scripts
- Analysed historical release trends, popular themes, and part diversity to uncover consumer and design patterns
- Applied DAX measures and data modelling to track correlations between number of pieces, retail pricing, and release year
- Developed interactive dashboards in Power BI to identify KPIs and provide data-driven insight
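
The correlations mentioned above were tracked with DAX measures in Power BI. As a plain-Python sketch of the underlying calculation, the function below computes the Pearson correlation coefficient between two columns. The function name pearson and the sample numbers are hypothetical and are not taken from the LEGO datasets.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations, and the root sum of squared deviations per column.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Made-up piece counts and prices, for illustration only.
pieces = [120, 350, 800, 1500, 2300]
prices = [9.99, 29.99, 69.99, 129.99, 199.99]
print(round(pearson(pieces, prices), 3))
```

A value close to 1 would indicate that price rises almost linearly with piece count; the same function can be applied to release year against either column.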