Exploratory data analysis (EDA) and prediction of customer churn at Orange using machine learning. This project was done for practice in March 2020 in Google Colab.
The repository includes:
- colab_prediction.ipynb – Python notebook file extracted from Google Colab
- data
  - orange_score.csv – dataset of 10k rows used to build the ML model
  - orange_train.csv – dataset of 100k rows used for the churn prediction
- results
  - list_of_all_customers_likely_to_churn.csv – list of all 100k customers with their probability_of_churn score
  - list_of_the_100_customers_most_likely_to_churn.xlsx – the same list, limited to the top 100 rows in descending order of probability_of_churn
  - pandas_profiling_report.html – Pandas Profiling report (a quick, automated EDA)
  - presentation_to_the_business.pdf – 5-slide presentation of the findings for the Orange CEO
  - technical_report.pdf – 31-page technical report covering the dataset, the approach, the experimental setup, the results, and the final conclusions
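
The top-100 list can be derived from the full list by sorting on the churn score. The snippet below is a minimal sketch of that step, not the notebook's actual code: the `probability_of_churn` column name comes from the description above, while the `customer_id` column, the helper `top_n_by_churn`, and the sample values are hypothetical and only there for illustration.

```python
import csv
import io


def top_n_by_churn(rows, n=100, score_col="probability_of_churn"):
    """Return the n rows with the highest churn probability, highest first."""
    return sorted(rows, key=lambda r: float(r[score_col]), reverse=True)[:n]


# Tiny in-memory stand-in for the full CSV; the ids and scores are made up.
sample_csv = io.StringIO(
    "customer_id,probability_of_churn\n"
    "a,0.12\n"
    "b,0.87\n"
    "c,0.45\n"
)
rows = list(csv.DictReader(sample_csv))

top_two = top_n_by_churn(rows, n=2)
print([r["customer_id"] for r in top_two])  # ['b', 'c']
```

To apply the same idea to the real file, open `results/list_of_all_customers_likely_to_churn.csv` with `csv.DictReader` in place of the in-memory sample, assuming the score column is named as above.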