πΊ Hypertension Risk Analysis & Prediction
This is a miniature project on analyzing the blood pressure metrics , cholesterol and other health factors of over 180,000 people and predicting hypertension in individuals.
It features:
* A full preprocessing pipeline built in Google Colab
* Statistical and Machine learning models trained on a 180K+ health dataset
* A Streamlit app to estimate individual hypertension risk scores
π Project Structure:
HypertensionRiskProject/ β βββ notebooks/ β βββ preprocessing_and_modelling.ipynb β βββ app/ β βββ hypertension_app.py β βββ source_data/ β βββ health_dataset.csv β βββ README.md βββ requirements.txt
---
π Features Used :
These ae the various health related attributes of people from the dataset
* Age (years)
* BMI (kg/m^2)
* Systolic & Diastolic BP (mmHg)
* Cholesterol (mg/dL)
* Glucose (mg/dL)
* LDL, HDL, Triglycerides (mg/dL)
* Heart Rate (bpm)
* Sleep Duration (hrs)
* Stress Level, Salt Intake (scaled 0-10)
---
Running the Streamlit App:
# Install dependencies
pip install -r requirements.txt
# Run the Streamlit application
streamlit run app/hypertension_app.py
App link ! --> https://hypertensionriskprediction-8fn7uzqs9ucy3urb953t3l.streamlit.app/
---
π§ Models Used :
In order to conclude the strongest correlations between the attributes of the dataset , and to conclude the attributes contributing the most to hypertension,
few statistical and machine learning models were used , which are:
* Logistic Regression
* Random Forest
* XGBoost
* Naive Bayes
* Linear SVC
* Combined model (averaged feature importance)
---
π¬ Notebooks
The preprocessing notebook includes:
* Data cleaning & null handling
* Label encoding & scaling
* Correlation heatmaps & feature plots
* Model training & evaluation
* Feature importance extraction
> [Colab Notebook Link](https://colab.research.google.com/drive/1N6xGsqU0bZb5v3SlhMRRwR3tcpxKkkN9?usp=sharing)
---
π Requirements
All dependencies are listed in `requirements.txt`, including:
* pandas
* scikit-learn
* xgboost
* streamlit
* matplotlib
* seaborn
* scipy ( for pearson's correlation test)
* Numpy
---
π― Future Improvements
* Add severity scoring with category breakdown
* Personalized health & lifestyle suggestions
* Downloadable PDF reports from app
* Streamlit Cloud deployment
---
π€ Author ?
Built with β€οΈ by Samuel Ragland
THANKS!