Predicting Default Risk - Creditworthiness

Classification Modelling and Performance Comparison

by Sooyeon Won

Keywords

Analytical Framework
Supervised Learning
Binary Classification Models
- Logistic Regression Model
- Decision Tree Model
- Random Forest Model
- Gradient Boosting Model
- AdaBoost Model
ROC curve
K-fold cross-validation
Model Selection and its Application

Summary of Findings

A bank recently received an influx of loan applications. I built and applied several classification models to provide an appropriate recommendation. Among these models, the Random Forest model proved the most suitable for predicting the creditworthiness of credit applicants. Based on the Random Forest model, I concluded that 408 customers belong to the segment 'Creditworthy'.

This analysis is separated into three parts.

The first part of the analysis (Filename: 01_Creditworthiness_Data_Cleaning_Exploration.ipynb) covers Data Cleaning and Data Exploration with appropriate visualizations. At the end of Part 1, I select the features associated with the creditworthiness of customers, which is the target variable of the analysis.
In the second part of the analysis (Filename: 02_Creditworthiness_Data_Analysis.ipynb), I train several classification models on the dataset: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, and AdaBoost. The models are then validated on the test dataset. Additionally, I evaluate their performance with k-fold cross-validation.
In the third part, after selecting the best-performing model, I determine how many new credit applicants are classified as creditworthy.
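
The workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the notebooks' actual code: the data is synthetic (generated with `make_classification` as a stand-in for the cleaned training file), the hyperparameters are defaults, the choice of 5 folds and accuracy as the selection metric is an assumption, and class label `1` is assumed to mean 'Creditworthy'.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the first 500 rows play the role of
# the labelled training file, the last 100 rows the unlabelled new applicants.
X_all, y_all = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=42
)
X, y = X_all[:500], y_all[:500]
X_new = X_all[500:]

# Hold out a test set for validation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# The five model families named in the README, with default settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    # Mean accuracy over 5 folds of the training data (k = 5 is an assumption).
    cv_accuracy = cross_val_score(
        model, X_train, y_train, cv=5, scoring="accuracy"
    ).mean()
    # Fit on the full training split, then measure ROC AUC on the test split.
    model.fit(X_train, y_train)
    test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    results[name] = {"cv_accuracy": cv_accuracy, "test_auc": test_auc}
    print(f"{name}: CV accuracy = {cv_accuracy:.3f}, test AUC = {test_auc:.3f}")

# Pick the model with the highest cross-validated accuracy.
best_name = max(results, key=lambda n: results[n]["cv_accuracy"])
best_model = models[best_name]

# Score the new applicants; label 1 is assumed to mean 'Creditworthy'.
n_creditworthy = int((best_model.predict(X_new) == 1).sum())
print(f"Best model: {best_name}; creditworthy applicants: {n_creditworthy}")
```

On the real data, `X` and `y` would come from the cleaned training spreadsheet and `X_new` from the applicants to score, after applying the same feature selection and encoding as in Part 1. The numbers printed by this sketch will not match the 408 reported above, because the data here is synthetic.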

References

Logistic Regression (aka logit, MaxEnt) classifier - sklearn
How to Calculate Feature Importance With Python
Decision Tree Classifier - sklearn
Categorical Features and Encoding in Decision Trees

Folders and files

- 01_Creditworthiness_Data_Cleaning_Exploration.ipynb
- 02_Creditworthiness_Data_Analysis-v2.ipynb
- 03_Creditworthiness_Conclusion.ipynb
- README.md
- credit-data-training-cleaned.xlsx
- credit-data-training.xlsx
- customers-to-score.xlsx
- model_rf.pickle
