cilab-ufersa/thyroid_disease_AI

Thyroid Disease classification with machine learning approaches

This project predicts hypothyroidism and hyperthyroidism using machine learning approaches. The data comes from the UCI Machine Learning Repository. It is preprocessed, and several machine learning algorithms are then applied to predict the disease. The project is divided into five parts:

  1. Data Preprocessing
  2. Exploratory Data Analysis
  3. Model Building
  4. Model Evaluation
  5. Model Explanation

Preprocessing removes missing values, encodes categorical variables, and scales the features. Exploratory data analysis is used to understand the data and the relationships between features. Models are built with several machine learning algorithms, including Logistic Regression, Random Forest, and Gradient Boosting. Each model is evaluated with precision, recall, F1-score, and accuracy, and the results are compared to select the best model.
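To make the evaluation metrics concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1-score from scratch for a binary task. It is not the project's evaluation code: the helper name `binary_metrics` and the toy labels are assumptions for illustration, and in practice scikit-learn provides equivalent functions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration only.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # disease predicted, disease present
        elif pred == positive and truth != positive:
            fp += 1  # disease predicted, disease absent
        elif pred != positive and truth == positive:
            fn += 1  # disease missed
        else:
            tn += 1  # correctly predicted as healthy

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = disease, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels, recall is lower than accuracy because half of the positive cases are missed, which is why accuracy alone can be misleading when the disease class is rare.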

The project is implemented in Python using Jupyter Notebook. The libraries used are pandas, numpy, matplotlib, seaborn, scikit-learn, and xgboost.

Table of Contents

  1. Installation
  2. Prerequisites
  3. Requirements

Installation

$ git clone

Prerequisites

You need the following to run the project:

  • Python 3.6+
  • pip 3+
  • virtualenvwrapper (recommended but not mandatory)

Requirements

$ pip install -r requirements.txt

Explainable AI

This project uses SHAP for explainable AI. SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.
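As an illustration of the idea behind Shapley values, the sketch below computes them exactly for a tiny toy scoring function by enumerating all feature coalitions. It does not use the SHAP library and is not the project's explanation code: the feature names, the scoring function `toy_score`, and the helper `shapley_values` are all assumptions made up for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition.

    Only feasible for a handful of features, since the cost grows
    exponentially. Hypothetical helper for illustration.
    """
    n = len(features)
    phi = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes `feature`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                marginal = value_fn(coalition | {feature}) - value_fn(coalition)
                total += weight * marginal
        phi[feature] = total
    return phi


def toy_score(coalition):
    """Made-up risk score: two main effects plus one interaction."""
    score = 0.0
    if "TSH" in coalition:
        score += 2.0
    if "T4" in coalition:
        score += 1.0
    if "TSH" in coalition and "T4" in coalition:
        score += 1.0  # interaction, split equally between the two
    return score  # "age" has no effect in this toy function


features = ["TSH", "T4", "age"]
phi = shapley_values(toy_score, features)
print(phi)
```

The contributions sum to the difference between the score with all features and the score with none, which is the additive property SHAP relies on. The interaction term is shared equally between the two features involved, and a feature with no effect receives a contribution of zero.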

Publications related to this project

Cavalcante, C. M. V., and Rosana C. B. Rego. "Early prediction of hypothyroidism based on feature selection and explainable artificial intelligence." In: Simpósio Brasileiro de Computação Aplicada à Saúde (SBCAS), 2024, Goiânia. Anais do XXIV Simpósio Brasileiro de Computação Aplicada à Saúde, 2024. pp. 49-60.

Cavalcante, C. M. V., and Rosana C. B. Rego. "Explainable AI Diagnosis for Hypothyroidism." In: 21st IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2024, Natal, Brazil.

Software Developed

HypoAssist: software for early prediction of hypothyroidism based on feature selection and explainable artificial intelligence. The software is available at HypoAssist©: Diagnostic Assistant for Hypothyroidism.

Supported by UFERSA Edital PROPPG 65/2022 (PAPC)

This work received financial support through a Scientific Initiation scholarship and UFERSA/PROPPG 65/2022 (PAPC).
