Mukunda1196/PopHealth-Predictive-Models

PopHealth-Predictive-Models

This project uses population health data from the University of Wisconsin Population Health Institute. The goal is to analyze the data and train supervised learning models to identify which factors may increase the risk of premature death. Both clustering and classification techniques are used.

Data

The data file is available at https://www.countyhealthrankings.org/sites/default/files/media/document/analytic_data2022.csv . The measures in the data file are documented at https://www.countyhealthrankings.org/sites/default/files/media/document/2022%20Analytic%20Documentation.pdf
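
As a rough illustration of how the data file might be loaded and checked for missing values, here is a minimal sketch using pandas. The helper names `load_analytic_data` and `summarize_missing` and the sample column names are hypothetical and do not come from this repository. Whether the real file needs any header rows skipped is an assumption to verify against the documentation, so `skiprows` is left as a parameter.

```python
import io

import pandas as pd

# URL taken from the Data section above; it is not fetched in this sketch.
DATA_URL = (
    "https://www.countyhealthrankings.org/sites/default/files/"
    "media/document/analytic_data2022.csv"
)


def load_analytic_data(source, skiprows=None):
    """Read the CSV from a path, URL, or file-like object (hypothetical helper)."""
    return pd.read_csv(source, skiprows=skiprows, low_memory=False)


def summarize_missing(df):
    """Return the fraction of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)


# Tiny in-memory stand-in with made-up column names, so the sketch runs offline.
sample = io.StringIO("county,measure_a,measure_b\nA,1.0,\nB,,\nC,3.0,5.0\n")
demo = load_analytic_data(sample)
missing = summarize_missing(demo)
print(missing)
```

Passing `DATA_URL` instead of `sample` should read the real file directly, assuming network access.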

Tools Used

  • EDA
  • Identifying and dealing with missing and null values
  • Identifying highly correlated variables
  • Performing common sense feature removal
  • Identifying and dealing with outliers
  • Normalizing data
  • K-means clustering with the silhouette method for choosing the number of clusters
  • Linear Regression with feature importance extraction
  • Decision Tree with feature importance extraction
  • Model evaluation and comparison
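
Several of the steps listed above can be sketched with pandas and scikit-learn. This is a minimal illustration on synthetic data, not the repository's actual code. The function names, the 0.9 correlation threshold, the range of candidate cluster counts, and the tree depth are all assumptions. The README does not say whether the decision tree is a classifier or a regressor, so this sketch assumes a continuous target and uses `DecisionTreeRegressor`.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor


def drop_highly_correlated(df, threshold=0.9):
    """Drop the later column of each pair whose absolute correlation exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is examined once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def choose_k_by_silhouette(X, k_values=range(2, 7), seed=0):
    """Return the k with the highest mean silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores


def feature_importances(X, y, names, seed=0):
    """Fit a linear regression and a decision tree; return per-feature scores."""
    linear = LinearRegression().fit(X, y)
    tree = DecisionTreeRegressor(max_depth=4, random_state=seed).fit(X, y)
    return pd.DataFrame(
        {
            "linear_abs_coef": np.abs(linear.coef_),
            "tree_importance": tree.feature_importances_,
        },
        index=names,
    )


# Synthetic stand-in data (NOT the County Health Rankings file):
# three well-separated groups, one redundant column, and a target driven by "a".
rng = np.random.default_rng(0)
centers = np.array([[-5.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
base = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers])
features = pd.DataFrame(base, columns=["a", "b"])
features["a_copy"] = features["a"] * 2.0 + rng.normal(0, 0.01, size=len(features))
target = 3.0 * features["a"] + rng.normal(0, 0.1, size=len(features))

reduced = drop_highly_correlated(features)      # removes the redundant column
X = StandardScaler().fit_transform(reduced)     # normalize before clustering
best_k, scores = choose_k_by_silhouette(X)
importances = feature_importances(X, target, reduced.columns)
print(best_k)
print(importances)
```

Because the features are standardized first, the absolute linear coefficients are on a comparable scale and can be read as a rough importance ranking alongside the tree's impurity-based importances.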