Employee-Retention---Random-Forest-Classifier

This project analyzes a dataset and builds predictive models to provide insights to the Human Resources (HR) department of a large consulting firm.

Employee Retention: Providing data-driven suggestions for HR

Background Salifort Motors


Salifort’s senior leadership team is concerned about how many employees are leaving the company. Salifort strives to create a corporate culture that supports employee success and professional development. Further, the high turnover rate is financially costly: Salifort makes a big investment in recruiting, training, and upskilling its employees. As a first step, the leadership team asks Human Resources to survey a sample of employees to learn more about what might be driving turnover. The dataset used in this project contains 15,000 rows and 10 columns and is available on Kaggle. The analysis and modeling follow the PACE workflow (Plan, Analyse, Construct, Execute).

Process

Plan Stage

Step 1. Imports

- Import packages
- Load dataset

Step 2. Data Exploration (Initial EDA and data cleaning)

- Understand your variables
- Clean dataset (missing data, redundant data, outliers)
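The cleaning steps above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the tiny data frame and its column names (`satisfaction_level`, `tenure`, `left`) are made up for the example, and the 1.5 × IQR rule is one common assumption for flagging outliers.

```python
import pandas as pd

# Made-up rows standing in for the HR dataset; column names are hypothetical.
df = pd.DataFrame({
    "satisfaction_level": [0.38, 0.80, 0.38, 0.72, None, 0.55, 0.61],
    "tenure": [3, 4, 3, 4, 5, 3, 25],
    "left": [1, 0, 1, 0, 0, 0, 0],
})

# Missing data: count missing values per column, then drop incomplete rows.
missing_counts = df.isna().sum()
df_clean = df.dropna()

# Redundant data: drop rows that exactly duplicate an earlier row.
df_clean = df_clean.drop_duplicates().reset_index(drop=True)

# Outliers: flag values outside 1.5 * IQR on one numeric column.
q1 = df_clean["tenure"].quantile(0.25)
q3 = df_clean["tenure"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df_clean[(df_clean["tenure"] < lower) | (df_clean["tenure"] > upper)]

print(missing_counts)
print(f"{len(df_clean)} rows after cleaning, {len(outliers)} flagged as outliers")
```

Whether flagged outliers should be removed depends on the model: tree-based models tolerate them well, while logistic regression is more sensitive to them.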

Analyse Stage

Step 3. Data Exploration

EDA Continuation and Visualisation

Construct Stage

Step 3. Model Building

- Fit a model that predicts the outcome variable using two or more independent variables (logistic regression, decision tree, and random forest)
- Check model assumptions
- Evaluate the model
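As a rough sketch of this stage, the three model families can be fit and compared with scikit-learn. The data here is synthetic (generated with `make_classification`), and the hyperparameters are illustrative assumptions, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the HR dataset.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Hold out a stratified test set so class proportions match in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical settings chosen only for illustration.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and record its test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: accuracy = {accuracy:.3f}")
```

Accuracy alone can mislead when few employees leave relative to those who stay, which is why the evaluation also reports precision, recall, and f1-score.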

Execute Stage

Step 4. Results and Evaluation

- Interpret model
- Evaluate model performance using metrics
- Prepare results, visualizations, and actionable steps to share with stakeholders
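For reference, the metrics reported in the summary follow from the confusion-matrix counts. The sketch below uses a hypothetical helper, `classification_metrics`, and made-up counts; it does not reproduce the project's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, f1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Made-up counts: 80 leavers caught, 20 false alarms, 20 leavers missed, 380 stayers.
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=380)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a retention setting, recall measures how many actual leavers the model catches, while precision measures how many flagged employees really leave.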


Summary

The logistic regression model achieved precision of 79%, recall of 82%, f1-score of 80% (all weighted averages), and accuracy of 82%, on the test set.

After feature engineering, the random forest model achieved a ROC AUC score of 93.72%, precision of 95.91%, recall of 85.71%, f1-score of 90.45%, and accuracy of 88.01% on the test set. The random forest modestly outperformed the decision tree model.
