KritiCParikh/Cardiovascular-Risk-Prediction

Cardiovascular-Risk-Prediction

Project Overview

This project predicts the 10-year risk of coronary heart disease (CHD) in patients using machine learning. By analyzing health indicators and lifestyle factors, we develop a model that can help healthcare professionals identify high-risk individuals and implement preventive measures.

Skills and Technologies:

  • Python

  • Data Analysis: Using Python libraries such as Pandas and NumPy for data manipulation and analysis.

  • Data Visualization: Employing Matplotlib and Seaborn for creating insightful visualizations of the dataset.

  • Machine Learning: Implementing various classification algorithms including Logistic Regression, K-Nearest Neighbors, Decision Trees, and Random Forests using Scikit-learn.

  • Feature Engineering: Performing data preprocessing tasks such as handling missing values, encoding categorical variables, and scaling numerical features.

  • Model Evaluation: Using cross-validation and performance metrics (accuracy, F1-score, ROC-AUC) to assess model performance.

  • Hyperparameter Tuning: Applying Grid Search to optimize model parameters for improved performance.

  • Statistical Analysis: Using libraries like StatsModels for in-depth statistical modeling.
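The scaling, cross-validation, and Grid Search steps listed above could be combined roughly as follows. This is a minimal sketch, not the project's actual code: the data is a synthetic stand-in generated with `make_classification`, and the choice of Logistic Regression, the `C` grid, the 5-fold split, and ROC-AUC as the tuning metric are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the project's dataset): 15 numeric features
# and an imbalanced binary target, roughly 15% positives.
X, y = make_classification(
    n_samples=600, n_features=15, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fit on each training fold,
# which keeps validation folds from leaking into the scaler.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical search grid over the regularization strength C.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)

best_C = search.best_params_["clf__C"]
cv_auc = search.best_score_               # mean ROC-AUC across the 5 folds
test_auc = search.score(X_test, y_test)   # ROC-AUC on the held-out split

print(f"best C={best_C}, CV ROC-AUC={cv_auc:.3f}, test ROC-AUC={test_auc:.3f}")
```

The same pattern extends to the other classifiers named above by swapping the `clf` step and its parameter grid.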

Business Impact

  1. Early Risk Identification: By accurately predicting the 10-year risk of coronary heart disease, healthcare providers can identify high-risk patients early, enabling timely interventions and preventive care.
  2. Resource Allocation: Healthcare systems can use these predictions to allocate resources more efficiently, focusing on patients with higher risk profiles.
  3. Personalized Care Plans: The model's insights can help in developing personalized treatment and lifestyle modification plans for patients based on their risk factors.
  4. Cost Reduction: Early intervention and prevention strategies based on accurate risk prediction can potentially reduce the long-term healthcare costs associated with treating advanced heart disease.

Dataset

The dataset used in this project contains 17 columns per patient, including an identifier and the target variable:

  • id: Unique identifier for each patient
  • age: Age of the patient
  • education: Education level (1-4)
  • sex: Gender (M/F)
  • is_smoking: Smoking status (YES/NO)
  • cigsPerDay: Number of cigarettes smoked per day
  • BPMeds: Blood pressure medication usage (0/1)
  • prevalentStroke: History of stroke (0/1)
  • prevalentHyp: Presence of hypertension (0/1)
  • diabetes: Presence of diabetes (0/1)
  • totChol: Total cholesterol level
  • sysBP: Systolic blood pressure
  • diaBP: Diastolic blood pressure
  • BMI: Body Mass Index
  • heartRate: Heart rate
  • glucose: Glucose level
  • TenYearCHD: 10-year risk of coronary heart disease (target variable, 0/1)

The dataset combines demographic information, lifestyle factors, medical history, and clinical measurements, which together form the inputs to the predictive model.
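As an illustration of how columns in this format might be prepared for modeling, here is a small sketch. The rows are made up, only a subset of the columns is shown, and the choice of which columns contain missing values and how they are filled (median for continuous values, most frequent value for coded ones) is an assumption rather than the project's documented approach. The `preprocess` helper is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical rows using column names from the list above; values are invented.
raw = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "age": [39, 46, 48, 61],
    "education": [4.0, 2.0, np.nan, 3.0],
    "sex": ["M", "F", "M", "F"],
    "is_smoking": ["NO", "NO", "YES", "YES"],
    "cigsPerDay": [0.0, 0.0, 20.0, np.nan],
    "BPMeds": [0.0, 0.0, np.nan, 0.0],
    "totChol": [195.0, 250.0, 245.0, np.nan],
    "glucose": [77.0, np.nan, 70.0, 103.0],
    "TenYearCHD": [0, 0, 0, 1],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the identifier, encode text columns, and fill missing values."""
    out = df.drop(columns=["id"]).copy()  # id carries no predictive signal

    # Encode the two text-valued columns as 0/1.
    out["sex"] = out["sex"].map({"F": 0, "M": 1})
    out["is_smoking"] = out["is_smoking"].map({"NO": 0, "YES": 1})

    # Assumed strategy: median for continuous measurements.
    for col in ["cigsPerDay", "totChol", "glucose"]:
        out[col] = out[col].fillna(out[col].median())

    # Assumed strategy: most frequent value for coded columns.
    for col in ["education", "BPMeds"]:
        out[col] = out[col].fillna(out[col].mode().iloc[0])

    return out


clean = preprocess(raw)
print(clean)
```

In a real workflow the fill values would be computed on the training split only and then reused on the test split, so that no information leaks from held-out data.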

Key Features of the Project

  • Exploratory Data Analysis (EDA): Thorough analysis to uncover patterns and relationships in the dataset.
  • Feature Engineering: Advanced techniques to capture complex interactions between health indicators.
  • Machine Learning Models: Implementation of various models to predict CHD risk.
  • Model Evaluation and Interpretation: Providing insights into the most important risk factors and the model’s performance.
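To make the evaluation step concrete, the sketch below computes the three metrics named in this README (accuracy, F1-score, ROC-AUC) with Scikit-learn. The labels and predicted probabilities are invented toy values, and the 0.5 decision threshold is an assumption; none of these numbers are results from the project.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Toy ground truth (1 = CHD within ten years) and made-up predicted probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.7, 0.8, 0.4, 0.9]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
threshold = 0.5
y_pred = [1 if p >= threshold else 0 for p in y_prob]

accuracy = accuracy_score(y_true, y_pred)  # share of correct predictions
f1 = f1_score(y_true, y_pred)              # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_prob)        # threshold-free ranking quality

print(f"accuracy={accuracy:.3f}  F1={f1:.3f}  ROC-AUC={auc:.4f}")
```

Accuracy and F1 depend on the chosen threshold, while ROC-AUC is computed from the probabilities directly. When positive cases are relatively rare, accuracy alone can look good for a model that misses most of them, which is one reason to report F1 and ROC-AUC alongside it.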

Thank you. Let’s keep learning and growing together!
