Machine Learning for Alzheimer's Brain Classification

Timeline
March 2024 - May 2024
Skills
MATLAB, Data Visualization
Machine Learning, Statistical Analysis, Biomedical Data Processing
Overview
In this project, I applied computational and machine learning methods to analyze large-scale biomedical datasets related to Alzheimer's Disease. My goal was to explore the relationship between the APOE gene and structural brain changes, and to evaluate how accurately machine learning models could distinguish Alzheimer's patients from controls using MRI data.
Process
Data Engineering
I integrated diverse datasets (GWAS, gene expression, transcriptomics, and MRI) into standardized formats (.csv, .mat) using MATLAB. This required extensive data cleaning, feature extraction, and merging across biomedical data sources.
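As a rough sketch of the cleaning and merging step, the snippet below joins two tables on a shared subject ID and drops rows with missing measurements. It is written in Python rather than the MATLAB used in the project, and the column names (subject_id, apoe_expression, gray_matter_volume) and all values are hypothetical stand-ins, not project data.

```python
import csv
import io

# Hypothetical miniature inputs standing in for two source tables.
EXPRESSION_CSV = """subject_id,apoe_expression
S01,2.4
S02,
S03,1.9
"""

MRI_CSV = """subject_id,gray_matter_volume
S01,612.5
S03,587.0
S04,640.2
"""


def read_table(text, value_column):
    """Return {subject_id: float}, skipping rows whose value is missing."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        raw = row[value_column].strip()
        if raw:  # basic cleaning: drop empty measurements
            table[row["subject_id"]] = float(raw)
    return table


def inner_join(left, right):
    """Keep only subjects present in both tables, sorted by ID."""
    shared = sorted(set(left) & set(right))
    return [(sid, left[sid], right[sid]) for sid in shared]


expression = read_table(EXPRESSION_CSV, "apoe_expression")
mri = read_table(MRI_CSV, "gray_matter_volume")
merged = inner_join(expression, mri)

for subject_id, apoe, volume in merged:
    print(subject_id, apoe, volume)
```

An inner join is only one possible choice here. It keeps the merged table free of gaps, at the cost of discarding subjects that appear in just one source.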
Statistical & Computational Analysis
I implemented statistical tests (t-tests, regression models) to check for significant differences in APOE gene expression across brain regions. Linear regression showed a strong correlation between APOE expression and gray-matter atrophy, which is consistent with APOE playing a role in Alzheimer's progression, although correlation alone does not establish causation.
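The two kinds of analysis mentioned above can be illustrated with a minimal sketch: a two-sample t statistic and a simple least-squares fit with its Pearson correlation. This is Python, not the project's MATLAB code. Welch's unequal-variance form of the t-test is an assumption (the write-up does not say which variant was used), and every number below is made up for illustration.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus Pearson r."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Made-up expression levels in two hypothetical brain regions.
region_a = [2.1, 2.5, 2.3, 2.8, 2.6]
region_b = [1.4, 1.7, 1.5, 1.9, 1.6]
t_stat = welch_t(region_a, region_b)

# Made-up, perfectly linear data so the fit is easy to check by eye.
expression_level = [1.0, 2.0, 3.0, 4.0]
atrophy_score = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r = linear_fit(expression_level, atrophy_score)

print(f"t = {t_stat:.2f}, slope = {slope:.2f}, r = {r:.2f}")
```

Turning the t statistic into a p-value additionally requires the Welch-Satterthwaite degrees of freedom and a t-distribution CDF, which the Python standard library does not provide.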
Machine Learning
I trained two machine learning models on MRI gray-matter volume features. A ridge-regularized linear model achieved 97.5% accuracy on held-out test data, indicating strong generalization. I also developed a neural network, which reached 84.0% accuracy after hyperparameter tuning but showed signs of overfitting. These results suggest that machine learning can distinguish Alzheimer's patients from controls using structural brain data alone.
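To make the idea of a ridge-regularized linear classifier concrete, here is a minimal Python sketch, not the MATLAB model from the project. It assumes labels coded as +1 (patient) and -1 (control), a closed-form least-squares fit with an L2 penalty on the feature weights, and classification by the sign of the score. The feature values are invented, and the penalty strength alpha = 0.01 is an arbitrary illustrative choice.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ridge(features, labels, alpha):
    """Return weights (bias last) minimizing squared error + alpha * ||w||^2."""
    rows = [f + [1.0] for f in features]  # append a constant bias column
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    for i in range(d - 1):  # penalize feature weights, leave the bias unpenalized
        gram[i][i] += alpha
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(d)]
    return solve(gram, xty)


def predict(weights, features):
    """Classify each sample as +1 or -1 from the sign of its linear score."""
    preds = []
    for f in features:
        score = sum(w * v for w, v in zip(weights, f)) + weights[-1]
        preds.append(1 if score >= 0 else -1)
    return preds


# Invented, normalized gray-matter volumes for two hypothetical regions.
train_x = [[0.62, 0.58], [0.60, 0.55], [0.65, 0.60],
           [0.80, 0.78], [0.83, 0.75], [0.78, 0.80]]
train_y = [1, 1, 1, -1, -1, -1]  # +1 = patient, -1 = control (assumed coding)
test_x = [[0.61, 0.57], [0.81, 0.79]]
test_y = [1, -1]

weights = fit_ridge(train_x, train_y, alpha=0.01)
preds = predict(weights, test_x)
accuracy = sum(p == y for p, y in zip(preds, test_y)) / len(test_y)
print(f"test accuracy: {accuracy:.2f}")
```

Scoring on samples that were not used for fitting is what makes the accuracy figure a measure of generalization. A large gap between training and test accuracy is the usual sign of overfitting.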
Next Steps
To build on this project, I could improve model performance by reducing overfitting, experiment with additional algorithms, and deploy the pipeline behind an interface where users can upload imaging data and receive predictions in real time. This would turn the analysis from an academic study into a functional, accessible software tool.