Udacity Machine Learning Engineer Nanodegree - Capstone Project on customer segmentation and acquisition

fningtian/Bertelsmann-Arvato-ML

Customer Segmentation and Acquisition - Bertelsmann Arvato (Jun 2021)

Machine Learning Engineer Udacity Nanodegree - Capstone Project

This GitHub repository hosts the code and report for "Capstone project - Arvato Customer Segmentation", which I developed and completed as part of the Udacity Machine Learning Engineer Nanodegree program.

In this project, I applied supervised and unsupervised machine learning algorithms to real-life data provided by Bertelsmann Arvato Analytics. More specifically, I worked on four demographic datasets and two metadata files provided by Arvato Financial Services, with the goal of helping a client mail-order company target its most likely new customers.

Table of Contents

  • Data Description
  • Technical Overview
  • Requirements
  • Results
  • Acknowledgements
  • Author

Data Description

Demographics Data:

Customer Segmentation

  • General Population demographics
  • Customer demographics

Customer Acquisition

  • Training data
  • Test data

Metadata providing attribute information:

  • DIAS Information Levels - Attributes
  • DIAS Attributes - Values

Technical Overview

The project is divided into the following steps:

  • Data Exploration and Pre-processing
  • Feature Engineering
  • Dimensionality Reduction
  • Clustering
  • Selection of Supervised Learning Models
  • Model Tuning
  • Model Evaluation
  • Predictions on the Test Dataset
  • Submission to Kaggle

Details of each step are in Report.pdf.
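The steps above can be sketched end to end on synthetic data. This is a minimal illustration, not the code from the notebook: the random arrays stand in for the Arvato tables, the number of components and clusters are arbitrary, ROC AUC is an assumed evaluation metric, and scikit-learn's `LogisticRegression` and `GradientBoostingClassifier` are swapped in for the lightgbm and xgboost models so that the example needs only numpy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# --- Customer segmentation (unsupervised) ---
# Synthetic stand-ins for the general population and customer tables (assumption).
population = rng.normal(size=(500, 10))
customers = rng.normal(loc=0.5, size=(200, 10))
population[rng.random(population.shape) < 0.05] = np.nan  # simulate missing values

N_CLUSTERS = 4  # arbitrary choice for this sketch
segmenter = make_pipeline(
    SimpleImputer(strategy="median"),    # pre-processing: fill missing values
    StandardScaler(),                    # put features on a common scale
    PCA(n_components=5, random_state=0), # dimensionality reduction
    KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0),  # clustering
)

# Fit on the general population, then assign customers to the same clusters.
pop_labels = segmenter.fit_predict(population)
cust_labels = segmenter.predict(customers)

# Share of each group per cluster; clusters where customers are
# over-represented point to promising segments.
pop_share = np.bincount(pop_labels, minlength=N_CLUSTERS) / len(pop_labels)
cust_share = np.bincount(cust_labels, minlength=N_CLUSTERS) / len(cust_labels)
print("population share per cluster:", pop_share.round(2))
print("customer share per cluster:  ", cust_share.round(2))

# --- Customer acquisition (supervised) ---
# Imbalanced synthetic training data: roughly 10% positive responses (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9], random_state=0
)

candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Compare candidate models with cross-validated ROC AUC.
scores = {
    name: cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
print("mean ROC AUC per model:", scores)
print("selected model:", best_name)
```

Wrapping the pre-processing, PCA, and KMeans steps in one pipeline means the customer data is transformed with exactly the parameters learned from the general population, which keeps the two cluster distributions comparable.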

Requirements

The Jupyter Notebook is written in Python (version 3.x required).

The required libraries for this project are in the requirement.txt file.

The main packages include: numpy, pandas, matplotlib, seaborn, scikit-learn, lightgbm and xgboost.
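Before opening the notebook, it can help to confirm that the main packages are installed. The snippet below is a small convenience sketch, not part of the repository; `REQUIRED` and `missing_packages` are hypothetical names introduced for illustration. Note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Maps each package listed above to its import name (illustrative helper).
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "lightgbm": "lightgbm",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the names of packages whose modules cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All main packages are available.")
```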

Results

The results are documented in the Jupyter Notebook. Please refer to Arvato Project Workbook.ipynb.

Acknowledgements

I would like to thank Udacity for offering this Capstone project and Arvato Financial Services for providing the real-life data.

The syllabus of this Machine Learning Nanodegree Program is here

Author

Funing Tian

Contact: here

Email: tian.570@osu.edu
