About Me

I am a mathematics and biomedical engineering graduate with a passion for forecasting, transforming data and providing data-driven solutions which have a measurable impact.

This portfolio provides an overview of the projects I have created to showcase the skills I have developed, as well as my experience with ETL & cloud services such as GCP, AWS & Microsoft Azure.

I continually update this site and add relevant projects, so advice on how to improve & further develop my skills in data analysis, machine learning & engineering is always welcome. Feel free to reach out using any of the contact details I have provided.

Projects - EDA

These projects showcase my practical skills working with SQL, Python, Excel, R, Power BI, Tableau, AWS, GCP & Microsoft Azure. Where appropriate there are links provided to my code on GitHub & downloadable files for reports I have created.

Endangered Species EDA

  • Utilised SQL to extract data from 10 related tables in the CITES Trade Database using JOINs and VIEWs
  • Transformed and filtered data in SQL using aggregate & filtering functions to improve the reporting process
  • Loaded and visualised data with Python identifying key trends & objectives for the conservation of endangered animals
  • Developed interactive dashboards in Power BI that provided insights to support decision-making in conservation planning
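
A minimal sketch of the JOIN-and-VIEW pattern described above, using Python's built-in sqlite3 module. The table names, columns and rows are invented for illustration; they are not the CITES schema or data.

```python
import sqlite3

# Hypothetical two-table schema with made-up rows (not the real CITES tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE species (species_id INTEGER PRIMARY KEY, common_name TEXT);
    CREATE TABLE trades (
        trade_id INTEGER PRIMARY KEY,
        species_id INTEGER,
        year INTEGER,
        quantity INTEGER
    );
    INSERT INTO species VALUES (1, 'African elephant'), (2, 'Green turtle');
    INSERT INTO trades VALUES (1, 1, 2020, 5), (2, 1, 2021, 3), (3, 2, 2021, 10);

    -- A view that joins the two tables and aggregates quantity per species and year.
    CREATE VIEW yearly_trade AS
    SELECT s.common_name, t.year, SUM(t.quantity) AS total_quantity
    FROM trades AS t
    JOIN species AS s ON s.species_id = t.species_id
    GROUP BY s.common_name, t.year;
""")

# Reporting queries can then filter the view instead of repeating the join.
rows = conn.execute(
    "SELECT common_name, year, total_quantity FROM yearly_trade "
    "WHERE year = 2021 ORDER BY common_name"
).fetchall()
print(rows)
```

Keeping the join inside a view means each report only needs a short filter query.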

Pharmaceutical Price Analysis Around the World

  • Extracted pharmaceutical pricing data across multiple global datasets using SQL (joins, filtering & aggregation)
  • Cleaned and standardised inconsistent pricing records to improve comparability
  • Built interactive dashboards in Power BI to visualise cost disparities across countries and support healthcare policy discussions

Amazon Product Web Scraper & EDA

  • Scraped & ingested product data using Python (BeautifulSoup) & centralised the cleaned data in Azure Synapse Analytics
  • Stored & managed the large datasets in Azure Synapse Analytics, building ETL pipelines to clean, aggregate & standardise data
  • Enabled SQL queries through Microsoft Azure & connected Azure Synapse Analytics to Power BI for creating dashboards
  • Created interactive dashboards in Power BI & carried out further exploratory data analysis in Python & RStudio

Data Professional Survey Dashboard

  • Retrieved raw survey data & imported into Power BI for ETL
  • Cleaned the data by standardising job titles, handling missing values & outliers, and filtering inconsistent entries
  • Restructured the data in Power Query by splitting & merging columns and reformatting categorical variables
  • Built DAX measures & developed a dynamic dashboard where stakeholders could explore the data by role, salary band & geography

Lego Models EDA

  • Collected and processed LEGO datasets using SQL queries and Python scripts
  • Analysed historical release trends, popular themes, and part diversity to uncover consumer and design patterns
  • Applied DAX measures and data modelling to track correlations between number of pieces, retail pricing, and release year
  • Developed interactive dashboards in Power BI to identify KPIs and provide data-driven insight

Projects - Deep / Machine Learning

SpaceX Flight Landing Prediction

  • Collected launch data by scraping SpaceX launch records from Wikipedia with BeautifulSoup and integrating the SpaceX REST API
  • Performed ETL, cleaned the data & developed interactive geospatial maps of launch sites with Folium
  • Trained classification models including logistic regression, decision trees, SVM & KNN
  • Developed an interactive dashboard in Plotly Dash that displays launch‑success statistics, payload/booster correlations and allows filtering by launch site
  • Leveraged Python/Jupyter along with Pandas, NumPy, BeautifulSoup, Requests, Scikit‑learn, Plotly, Folium and Dash across the project

Mathematics for Machine Learning Project

  • Examined vector spaces, inner products, angles, orthogonality & projections
  • Developed notebooks which utilised linear algebra, machine learning algorithms and neural network theory
  • Implemented algorithms such as K‑Nearest Neighbours (KNN) and PageRank to translate theoretical insights into ML code
  • Practised gradient-based optimisation and backpropagation on problems involving fitting probability distributions and modelling helices with neural networks
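
As an illustration of the PageRank item above, here is a minimal power-iteration sketch in plain Python. The damping factor, tolerance and four-page toy graph are assumptions chosen for the example; this is not the project's notebook code.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on a dict mapping each page to the pages it links to.

    Assumes every link target also appears as a key of `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page shares its rank equally among the pages it links to.
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy web: three pages link to C, and nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
print(sorted(ranks.items(), key=lambda item: -item[1]))
```

Page C ends up with the highest score because it receives links from three pages, while D only keeps the baseline share that every page gets.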

Deep Learning Project

  • Explored a range of deep‑learning architectures through experiments with Convolutional Neural Networks (CNNs) & Vision Transformers (ViTs) for a waste classification problem
  • Built and compared CNN classifiers in Keras and PyTorch, including a custom PyTorch model trained from scratch with comprehensive training/validation monitoring
  • Implemented Vision Transformers in both frameworks, experimented with hybrid CNN‑ViT architectures and analysed their training efficiency relative to CNNs
  • Developed predictors for breast‑cancer classification, investigating memory‑based versus generator‑based data pipelines and using cross‑validation for evaluation
  • Applied transfer learning and fine‑tuning, with model selection and hyper‑parameter tuning for waste image classification

Generative Pre-Trained Transformer (GPT)

  • Developed a GPT in PyTorch with self-attention, layer normalisation, embeddings & residual connections, replicating the 124-million-parameter GPT-2 model in AWS SageMaker & Docker
  • Built data pipelines with tiktoken, using multiprocessing for tokenising, sharding & ingesting datasets
  • Tuned models using cross-entropy loss & optimised with cosine learning-rate scheduling, dropout & mixed precision acceleration
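
The cosine learning-rate scheduling mentioned above can be sketched in plain Python as follows. The linear warm-up phase and every numeric value (peak and floor rates, step counts) are assumptions for illustration, not the settings used in the project.

```python
import math


def cosine_lr(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=10, max_steps=100):
    """Learning rate at a given step: linear warm-up, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Warm-up (an assumption here): ramp linearly up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:
        # After the schedule ends, hold the floor value.
        return min_lr
    # Fraction of the decay phase completed, between 0 and 1.
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + coeff * (max_lr - min_lr)


schedule = [cosine_lr(s) for s in range(120)]
print(schedule[0], schedule[10], schedule[55], schedule[119])
```

In a PyTorch training loop, a value like this would typically be written into each optimiser parameter group's "lr" entry before the optimiser step.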

Real-Time Facial Muscle Activation Detector for Locked-In Syndrome

  • Applied signal processing techniques such as Chebyshev & Butterworth filters and wavelet transforms in MATLAB & Python
  • Trained a binary classifier in AWS SageMaker using scikit-learn & XGBoost, achieving 92.3% accuracy, 100% precision & 85.71% recall on unseen data
  • Designed an end-to-end ML pipeline covering data ingestion, preprocessing, model training & real-time prediction, made reproducible via Docker

Churn Model

Computer Vision

Projects - Mathematical Modelling & Inference

PK/PD Modelling to Assess The Effects of Anti-Cancer Agents on Tumour Volume

  • Used MATLAB, Python & Excel, in collaboration with GlaxoSmithKline, to forecast an optimal dosing strategy
  • Transformed, cleaned & standardised the dataset through missing-value imputation, outlier detection & normalisation
  • Fitted, validated & visualised model parameters, then reported results to stakeholders
  • Followed compliance requirements and carried out risk assessments for working with sensitive data
  • Produced a novel strategy for combined therapies & stress tested the model

Simulation-Based Mathematical Models & Parameter Estimation to Interpret Tissue Growth Experiments

  • Studied cell migration in tissues using stochastic lattice‑based simulations, PDE continuum approximations and discrete agent‑based random walks
  • Compared measurement‑error models by fitting simulation outputs to experimental‑style data, evaluating additive Gaussian versus multinomial error formulations for single & multiple subpopulation scenarios
  • Provided synthetic data and scripts for parameter estimation, identifiability studies and prediction interval calculations
  • Relied heavily on numerical methods, statistical modelling, PDE solving & inference
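
To give a flavour of the lattice-based random walks listed above, here is a minimal one-dimensional sketch with an exclusion rule (at most one agent per site). The lattice size, agent count, initial condition and update rule are illustrative assumptions rather than the models used in the project.

```python
import random


def simulate_exclusion_walk(length=50, n_agents=10, n_steps=200, seed=0):
    """Unbiased nearest-neighbour random walk on a 1D lattice with exclusion."""
    rng = random.Random(seed)
    # Start with the agents packed into the left-most sites.
    occupied = [i < n_agents for i in range(length)]
    for _ in range(n_steps):
        # One time step = n_agents attempted moves by randomly chosen agents.
        for _ in range(n_agents):
            agents = [i for i, occ in enumerate(occupied) if occ]
            site = rng.choice(agents)
            target = site + rng.choice((-1, 1))
            # Abort the move if the target is off the lattice or already occupied.
            if 0 <= target < length and not occupied[target]:
                occupied[site] = False
                occupied[target] = True
    return occupied


final = simulate_exclusion_walk()
print("".join("#" if occ else "." for occ in final))
```

Averaging the occupancy over many seeds gives a density profile, which is the kind of quantity a continuum PDE approximation aims to describe.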

StochLab: Interactive Stochastic Modelling Dashboard

  • Developed a full-stack interactive dashboard using Python & a REST API to simulate stochastic processes & financial models
  • Developed numerical methods in Python for on-demand parameter exploration & statistical modelling within the dashboard
  • Built Monte Carlo, Gillespie & Brownian motion models in Python, deployed on GitHub & used Docker containers for reproducibility
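
As a rough sketch of the Gillespie method named above, the following simulates a simple birth-death process in plain Python. The process, rate constants, time horizon and function name are assumptions for illustration and are not taken from the StochLab code.

```python
import random


def gillespie_birth_death(n0=10, birth=1.0, death=1.1, t_max=5.0, seed=0):
    """Exact stochastic simulation of a birth-death process up to time t_max."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while n > 0:
        birth_rate = birth * n
        death_rate = death * n
        total = birth_rate + death_rate
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Choose which event fires, in proportion to its rate.
        if rng.random() * total < birth_rate:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


times, counts = gillespie_birth_death()
print(len(times), counts[-1])
```

Each run yields one sample path; repeating it over many seeds gives the Monte Carlo estimates of means and spreads that a dashboard could plot.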

Work Experience

Quantitative Biology Research Placement – QUT, Brisbane, Australia

Feb 2025 – Jun 2025
  • Performed mathematical modelling of tissue growth with a specialisation in stochastic processes, numerical methods, applied statistics, model validation & stress testing
  • Utilised Python, RStudio, Julia & MATLAB for producing synthetic data, forecasting & validation
  • Collaborated with data scientists, machine learning engineers & biologists on cellular systems research

Administrator – University of Surrey, Guildford, UK

Oct 2024 – Jan 2025
  • Utilised CRM software (Dynamics 365) for forecasting, data management, reporting & dashboard development, enabling efficient case management & improved service delivery
  • Updated compliance documentation & identified KPIs, improving operational efficiency
  • Transformed, filtered & analysed sensitive data in Microsoft Excel
  • Collaborated cross-functionally to share data-driven insight across teams & reduce student wait times

Education

Master of Science in Biomedical Engineering – University of Warwick

Oct 2023 – Oct 2024

Bachelor of Science (Hons) in Mathematics with Economics – Loughborough University

Oct 2020 – Jul 2023