Telco Customer Churn

Introduction

Churn prediction is the process of identifying which customers are most likely to stop using a service or cancel their subscription. Predicting churn matters because acquiring new customers is more expensive than retaining existing ones. Once we have pinpointed the customers who are most likely to churn, we can design marketing strategies that increase the likelihood that they will stay.

Objective

Predict customer churn and analyze the factors that drive it in the company.

Data Understanding

The raw data contains 7043 rows (customers) and 21 columns (features), which are:

customerID: Customer ID
gender: Whether the customer is male or female
SeniorCitizen: Whether the customer is a senior citizen or not (1, 0)
Partner: Whether the customer has a partner or not (Yes, No)
Dependents: Whether the customer has dependents or not (Yes, No)
tenure: Number of months the customer has stayed with the company
PhoneService: Whether the customer has a phone service or not (Yes, No)
MultipleLines: Whether the customer has multiple lines or not (Yes, No, No phone service)
InternetService: Customer’s internet service provider (DSL, Fiber optic, No)
OnlineSecurity: Whether the customer has online security or not (Yes, No, No internet service)
OnlineBackup: Whether the customer has online backup or not (Yes, No, No internet service)
DeviceProtection: Whether the customer has device protection or not (Yes, No, No internet service)
TechSupport: Whether the customer has tech support or not (Yes, No, No internet service)
StreamingTV: Whether the customer has streaming TV service or not (Yes, No, No internet service)
StreamingMovies: Whether the customer has streaming movies service or not (Yes, No, No internet service)
Contract: Type of contract (Month-to-month, One year, Two year)
PaperlessBilling: Whether the customer uses paperless billing or not (Yes, No)
PaymentMethod: Type of payment method (Electronic check, Mailed check, Bank transfer (automatic), and Credit card (automatic))
MonthlyCharges: The amount charged to the customer each month
TotalCharges: The total amount charged to the customer
Churn: Our target feature, whether the customer churned or not (Yes, No)
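
As a quick sanity check on the description above, the column list can be compared against the CSV header. The following is a minimal sketch using only the standard library; the helper name check_schema and the two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Column names exactly as listed in the Data Understanding section.
EXPECTED_COLUMNS = [
    "customerID", "gender", "SeniorCitizen", "Partner", "Dependents",
    "tenure", "PhoneService", "MultipleLines", "InternetService",
    "OnlineSecurity", "OnlineBackup", "DeviceProtection", "TechSupport",
    "StreamingTV", "StreamingMovies", "Contract", "PaperlessBilling",
    "PaymentMethod", "MonthlyCharges", "TotalCharges", "Churn",
]


def check_schema(csv_file):
    """Return (row_count, missing_columns, unexpected_columns) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    row_count = sum(1 for _ in reader)  # counts data rows, not the header
    return row_count, missing, unexpected


# Made-up two-row sample, for illustration only (not real customers).
sample = io.StringIO("\n".join([
    ",".join(EXPECTED_COLUMNS),
    "0001-AAAAA,Female,0,Yes,No,1,No,No phone service,DSL,No,Yes,No,No,No,No,"
    "Month-to-month,Yes,Electronic check,29.85,29.85,No",
    "0002-BBBBB,Male,1,No,No,24,Yes,Yes,Fiber optic,No,No,No,No,Yes,Yes,"
    "One year,No,Mailed check,89.10,2138.40,Yes",
]) + "\n")

result = check_schema(sample)
print(result)  # (2, [], [])

# On the repository's file, the same check would look like:
# with open("Telco-Customer-Churn.csv", newline="") as f:
#     print(check_schema(f))
```

If the description above holds, running the check on the full file should report 7043 rows and no missing or unexpected columns.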

Methods

We build classification models using random forest, with and without SMOTE (Synthetic Minority Oversampling Technique), and XGBoost. Because the target variable is imbalanced, accuracy is a misleading evaluation metric: a model that always predicts the majority class can score well while finding no churners. We use recall (sensitivity) instead, because the priority is to catch as many of the customers who will actually churn as possible.
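
To illustrate why accuracy is unreliable on an imbalanced target, here is a small sketch in plain Python. The labels are hypothetical (8 non-churners and 2 churners, a ratio chosen only for illustration), and the functions accuracy and recall are written out by hand rather than taken from the notebook.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive="Yes"):
    """Share of actual positives (churners) that the model found."""
    true_pos = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_pos = sum(1 for t in y_true if t == positive)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical labels: 8 customers stay, 2 churn.
y_true = ["No"] * 8 + ["Yes"] * 2

# A useless "model" that predicts that nobody churns.
always_no = ["No"] * 10

print(accuracy(y_true, always_no))  # 0.8
print(recall(y_true, always_no))    # 0.0
```

The always-"No" predictor looks respectable on accuracy yet identifies none of the churners, which is exactly the failure that recall exposes.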

Conclusion

The data does not contain major issues. There are no NULL values or duplicated rows. Overall, the minimum and maximum values make sense for each column.
Most of the continuous numerical columns have asymmetric (skewed) distributions, and the boxplots show no outliers in any numerical column.
The target variable has an imbalanced distribution.
Customers who are more likely to churn tend to share several traits:

they don't have a partner
they don't have dependents
they have phone service and use fiber optic internet service
they didn't subscribe to any extra services
they have a month-to-month contract
they chose paperless billing
they pay by electronic check
they are not senior citizens
they have subscribed to the service for a shorter time (smaller tenure)
they have higher MonthlyCharges
they have lower TotalCharges
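
Tendencies like those listed above can be checked by comparing the churn rate across the values of a categorical column. Below is a minimal sketch in plain Python; the helper churn_rate_by is hypothetical, and the four rows are made up to show the mechanics rather than to reproduce any result from the dataset.

```python
from collections import defaultdict


def churn_rate_by(rows, column, target="Churn", positive="Yes"):
    """Map each value of `column` to the share of its rows where target == positive."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        if row[target] == positive:
            churned[key] += 1
    return {key: churned[key] / totals[key] for key in totals}


# Made-up rows, for illustration only.
rows = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
]

rates = churn_rate_by(rows, "Contract")
print(rates)  # {'Month-to-month': 0.5, 'Two year': 0.0}
```

Comparing rates rather than raw counts matters here, since a group can contain most of the churners simply because it contains most of the customers.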

After modeling, XGBoost turned out to be the most appropriate of the models we tried for this imbalanced data. Note, however, that we have not done any feature selection here; with feature selection, the results could improve.
