Predict Heart Health with Hybrid Machine Learning Models
Heart disease detection using hybrid machine learning and ensemble learning models enables early diagnosis and treatment, potentially reducing mortality rates associated with cardiovascular diseases (CVD). This project demonstrates the power of data science in making healthcare more accessible and efficient. 🌍💡
If you're interested in the research aspect of this project, explore my detailed paper:
📄 Read the Research Paper Here
This project involves building and evaluating machine learning models to predict heart disease based on the heart.csv dataset. The tool is deployed online, so anyone can access it for predictions!
🚀 Live Demo: Heart Disease Predictor
⚠ Note: The site might take a few seconds to load due to server traffic.
- Overview
- Dataset Information
- Installation
- Usage
- Models Evaluated
- Results
- Visualization
- Contributing
- License
🔗 Watch the Project Walkthrough
The goal of this project is to predict the presence of heart disease in patients using machine learning models trained on medical attributes. By leveraging ensemble learning, we improve accuracy and reliability.
🛠 Key Features:
- Hybrid machine learning models for better accuracy.
- User-friendly web-based tool for remote accessibility.
- Rich data visualizations for insights.
- Dataset Used: heart.csv
- Attributes: Age, Sex, Chest Pain Type, Resting BP, Serum Cholesterol, Fasting Blood Sugar, Resting ECG, Max HR, Exercise-induced Angina, Oldpeak, Slope, Number of Vessels, and Thal.
- Source:
The dataset contains the following columns:
- age: Age of the patient
- sex: 1 = male, 0 = female
- cp: Chest pain type (1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic)
- trestbps: Resting blood pressure
- chol: Serum cholesterol in mg/dl
- fbs: Fasting blood sugar > 120 mg/dl
- restecg: Resting electrocardiographic results (0, 1, 2)
- thalach: Maximum heart rate achieved
- exang: Exercise-induced angina
- oldpeak: ST depression induced by exercise relative to rest
- slope: The slope of the peak exercise ST segment
- ca: Number of major vessels (0-3) colored by fluoroscopy
- thal: 3 = normal, 6 = fixed defect, 7 = reversible defect
- target: 1 = presence of heart disease, 0 = absence of heart disease
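The column descriptions above can be turned into a quick sanity check before training. The following is a minimal, dependency-free sketch, not part of the project's code: the function name `check_heart_csv` and the sample rows are hypothetical, and only the `sex` and `target` encodings described above are verified.

```python
import csv
import io

# Column names as listed in the dataset description
EXPECTED_COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target",
]


def check_heart_csv(handle):
    """Return (row_count, problems) for a CSV file-like object."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        return 0, [f"missing columns: {missing}"]

    problems = []
    row_count = 0
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        row_count += 1
        if row["sex"] not in {"0", "1"}:
            problems.append(f"line {line_no}: sex must be 0 or 1")
        if row["target"] not in {"0", "1"}:
            problems.append(f"line {line_no}: target must be 0 or 1")
    return row_count, problems


# Illustrative rows only, not taken from heart.csv; the second row has an
# invalid target value on purpose.
sample = io.StringIO(
    "age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target\n"
    "50,1,1,120,200,0,0,150,0,1.0,1,0,3,1\n"
    "60,0,2,130,220,0,1,140,1,0.5,2,1,7,2\n"
)
rows, issues = check_heart_csv(sample)
print(rows, issues)
```

To check the real file, open it with `open("heart.csv", newline="")` and pass the handle to the same function.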
I have included requirements and dependency files.

Clone the repository:

```bash
git clone https://github.com/erenyeager101/Heart-Disease-monitoring.git
cd Heart-Disease-monitoring
```

Ensure you have all dependencies installed.
Run the main script:

```bash
python main.py
```
The following models are evaluated in this project:
- Logistic Regression
- Naive Bayes
- Support Vector Machine (SVM)
- K-Nearest Neighbors (KNN)
- Decision Tree
- Random Forest
- Neural Network

Results

The stacking classifier uses Random Forest, Decision Tree, and KNN as its base estimators. The accuracy scores of the models are as follows:
- Logistic Regression: 85.25%
- Naive Bayes: 85.25%
- Support Vector Machine: 81.97%
- K-Nearest Neighbors: 67.21%
- Decision Tree: 81.97%
- Random Forest: 88.76%
- Neural Network: 85.25%
- Stacking Classifier: 90.16%
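The results above name Random Forest, Decision Tree, and KNN as the base estimators of the stacking classifier. A stacking setup along those lines might look like the sketch below, written with scikit-learn. Several things here are assumptions rather than the project's actual configuration: the logistic-regression final estimator, all hyperparameters, the helper name `build_stacking_classifier`, and the synthetic data, which stands in for heart.csv so the example runs on its own. The accuracy it prints is therefore unrelated to the scores reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_stacking_classifier(random_state=0):
    """Stack Random Forest, Decision Tree, and KNN under a meta-learner."""
    base_estimators = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    # Assumption: logistic regression as the final estimator; the README
    # does not say which meta-learner the project uses.
    return StackingClassifier(
        estimators=base_estimators,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic stand-in with 13 features, matching the number of attributes
# in heart.csv; swap in the real feature matrix and target column.
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = build_stacking_classifier()
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Stacking accuracy on synthetic data: {accuracy:.2%}")
```

The base estimators are fitted on cross-validated folds (`cv=5`), and the final estimator learns from their out-of-fold predictions, which is what lets the stacked model outperform a weak member such as KNN.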
The project includes a bar plot that compares the accuracy scores of different models.
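```python
import matplotlib.pyplot as plt
import seaborn as sns

# Model names and accuracy scores (%) as reported in the results
algorithms = ["Logistic Regression", "Naive Bayes", "SVM", "KNN",
              "Decision Tree", "Random Forest", "Neural Network", "Stacking"]
scores = [85.25, 85.25, 81.97, 67.21, 81.97, 88.76, 85.25, 90.16]

# Example code to plot the accuracy scores
plt.figure(figsize=(15, 8))
sns.barplot(x=algorithms, y=scores)
plt.xlabel("Algorithms")
plt.ylabel("Accuracy score")
plt.show()
```

Saving and Loading Models

The Random Forest model, which achieved the highest accuracy among the individual models (only the stacking classifier scored higher), is saved using pickle for future use.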
```python
import pickle

# rf is the trained Random Forest model
with open('model_randomforestversion2.pkl', 'wb') as f:
    pickle.dump(rf, f)
```
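Loading the saved model back is the mirror image of saving it. The sketch below shows a save/load round trip with pickle. The helpers `save_model` and `load_model` are hypothetical names, not part of the project, and a plain dictionary stands in for the trained classifier so the example runs without scikit-learn.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize a model object to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously pickled model object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the trained classifier (`rf` in the project), used here so
# the round trip can be shown without training anything.
placeholder_model = {"name": "random_forest", "n_estimators": 100}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_randomforestversion2.pkl")
    save_model(placeholder_model, path)
    restored = load_model(path)

print(restored == placeholder_model)  # True
```

With a real scikit-learn model, the restored object exposes the same `predict` method as the original. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.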
Contributions are welcome! Please create a pull request or raise an issue to discuss your ideas.
This project is licensed under the MIT License - see the LICENSE file for details.