ttozatto/sparkify

Sparkify - Churn Prediction for a Music Streaming App with PySpark

This repository is part of the final project submitted to Udacity for the Data Science Nanodegree. The objective is to predict user churn for a simulated music streaming app, using historical data from user interactions.

A blog post with a detailed analysis is available at https://medium.com/@ttozatto.ds/churn-prediction-for-music-streaming-app-sparkify-d6e26d1ac80f

Dependencies

  • pyspark
  • matplotlib
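
As a quick sanity check before running the project, the sketch below reports which of the two dependencies listed above cannot be imported in the current environment. It is only an illustration and not part of the repository: the helper name `missing_dependencies` and the suggested `pip install` message are assumptions.

```python
from importlib.util import find_spec

# The two packages listed under Dependencies.
REQUIRED = ("pyspark", "matplotlib")


def missing_dependencies(packages=REQUIRED):
    """Return the names in `packages` that cannot be found for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
        # Assumes pip is the package manager in use.
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All dependencies are available.")
```

Note that PySpark also needs a Java runtime to start a Spark session, which this check does not cover.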

Files

Summary of Results

Test Scores

[Figure: results_medium]

Parameters for best models

[Figure: bestModel]

Feature importance

[Figure: feature_importance]

Acknowledgements

I would like to give special thanks to:

  • Udacity, which proposed this project as part of the Data Science Nanodegree.
  • The Spark team and community, who provide a powerful open-source tool to everyone.