This repository contains the code and resources for a machine learning project aimed at diabetes detection. We have implemented multiple machine learning models and data visualization techniques to build an accurate diabetes detection system.
We employed the following machine learning models for our project:
- Decision Tree
- Logistic Regression
- Support Vector Machine
- Random Forest
- XGBoost
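The model families listed above can be compared side by side in a few lines. The following is a minimal sketch, not the notebook's actual code: it uses synthetic data from `make_classification` as a stand-in for the dataset, and the hyperparameters (`max_depth=4`, `n_estimators=100`, the 80/20 split) are illustrative assumptions. XGBoost is added only if the `xgboost` package can be imported.

```python
# Sketch: fit each model family on synthetic data and compare test accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: 8 numeric features, binary target).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale-sensitive models (logistic regression, SVM) are wrapped with a scaler.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=42),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# XGBoost is an optional dependency in this sketch; skip it if not installed.
try:
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier(n_estimators=100, random_state=42)
except ImportError:
    pass

# Fit every model and record its accuracy on the held-out test split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Print the models from highest to lowest test accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

To try this on the real data, replace the synthetic `X` and `y` with the feature columns and target column loaded from the dataset file.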
In addition to the models, we used various visualization techniques to gain insights from the data and evaluate model performance. These visualizations include:
- Histograms: Used to visualize the distribution of key features in the dataset.
- Correlation Matrix Graph: Helps in understanding the relationships between different features and their impact on the target variable.
- ROC Curve: A graphical representation of the Receiver Operating Characteristic to evaluate the model's performance.
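The three visualizations above can be sketched with pandas, matplotlib, and scikit-learn. This example is illustrative only: the column names (`feature_a`, `feature_b`, `feature_c`, `target`) and the synthetic data are hypothetical, and a logistic regression stands in for whichever model is being evaluated.

```python
# Sketch: histogram, correlation matrix, and ROC curve on synthetic data.
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Hypothetical numeric features; the real dataset's columns will differ.
df = pd.DataFrame(
    {
        "feature_a": rng.normal(120, 30, n),
        "feature_b": rng.normal(32, 6, n),
        "feature_c": rng.integers(21, 70, n).astype(float),
    }
)
# Synthetic binary target driven by feature_a and feature_b.
logit = 0.03 * (df["feature_a"] - 120) + 0.1 * (df["feature_b"] - 32)
df["target"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# 1. Histogram: distribution of a single feature.
axes[0].hist(df["feature_a"], bins=20)
axes[0].set_title("Histogram of feature_a")

# 2. Correlation matrix: pairwise Pearson correlations, including the target.
corr = df.corr()
im = axes[1].imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
axes[1].set_xticks(range(len(corr.columns)))
axes[1].set_xticklabels(corr.columns, rotation=45, ha="right")
axes[1].set_yticks(range(len(corr.columns)))
axes[1].set_yticklabels(corr.columns)
axes[1].set_title("Correlation matrix")
fig.colorbar(im, ax=axes[1])

# 3. ROC curve: true positive rate vs. false positive rate on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"),
    df["target"],
    test_size=0.25,
    stratify=df["target"],
    random_state=0,
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]  # predicted probability of the positive class
fpr, tpr, _ = roc_curve(y_test, probs)
roc_auc = auc(fpr, tpr)
axes[2].plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
axes[2].plot([0, 1], [0, 1], linestyle="--", label="Chance")
axes[2].set_xlabel("False positive rate")
axes[2].set_ylabel("True positive rate")
axes[2].set_title("ROC curve")
axes[2].legend()

fig.tight_layout()
fig.savefig("visualizations.png")
```

Inside a Jupyter notebook the `matplotlib.use("Agg")` line and the `savefig` call can be dropped, since the figure renders inline.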
After experimenting with and evaluating these models, we found that the Decision Tree model achieved an accuracy of 78% for diabetes detection. This result suggests that machine learning can be a useful tool for identifying diabetes cases.
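For reference, accuracy is the fraction of predictions that match the true labels, i.e. (TP + TN) / (TP + TN + FP + FN). The short sketch below shows the calculation with scikit-learn; the label lists are made-up values for illustration and are not taken from the project's results.

```python
# Sketch: computing accuracy from a confusion matrix with hypothetical labels.
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical ground-truth labels and predictions (1 = diabetic, 0 = not diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# For binary labels {0, 1}, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Accuracy is the share of correct predictions among all predictions.
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")    # TP=3 TN=5 FP=1 FN=1
print(f"accuracy = {accuracy:.2f}")           # accuracy = 0.80
print(f"accuracy_score = {accuracy_score(y_true, y_pred):.2f}")  # same value
```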
- `diabetes.csv`: Contains the dataset used for training and testing.
- `Diabetes.ipynb`: Jupyter notebook with the code for the project, including data preprocessing, model training, and visualization.
- `README.md`: This document, providing an overview of the project.
To get started with this project, follow these steps:
- Clone this repository to your local machine.
- Navigate to the `notebooks/` directory and open the Jupyter notebooks to explore the code.
- Ensure you have the required libraries and dependencies installed. You can do this using `pip install -r requirements.txt`.
- Run the notebooks to train and evaluate the machine learning models and visualize the results.
To run the code in this repository, you will need to have the following Python libraries and packages installed:
- numpy
- pandas
- scikit-learn
- matplotlib
- seaborn
- xgboost
You can install these dependencies using pip with the provided `requirements.txt` file:
pip install -r requirements.txt
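After installing, it can help to confirm that each dependency is actually available in the active environment. The snippet below is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of this repository.

```python
# Sketch: report which of the required packages are installed, and at what version.
from importlib import metadata

# Distribution names as listed in this README.
REQUIRED = ["numpy", "pandas", "scikit-learn", "matplotlib", "seaborn", "xgboost"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(REQUIRED)
for name, version in versions.items():
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

Any package reported as not installed can be added by re-running the pip command above.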
This project was developed with the support of various open-source libraries and resources. We would like to thank the community for their contributions and the dataset providers for making their data available for research and analysis.
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to explore, contribute, or use the code in your own projects. If you have any questions or suggestions, please open an issue or reach out to the project maintainers. We welcome your feedback and collaboration!