Handle missing values in Categorical Features

python

The purpose of this project is to show different ways of dealing with missing values in categorical features. I have used the Classified Ads for Cars dataset from Kaggle to predict ad prices with a simple linear regression model.

To show the various strategies and their pros and cons, we will focus on one particular categorical feature of this dataset: the maker, which is the brand name of the car (Toyota, Kia, Ford, BMW, ...).

We will cover the following techniques:

  • Replace missing values with the most frequent values.
  • Delete rows with null values.
  • Predict the missing values using a classification algorithm (supervised or unsupervised).
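
The three techniques above can be sketched in a few lines. This is only a minimal illustration and not the project's actual code. It uses the standard library alone, the rows are made-up toy data, and the column names `engine_power` and `price` as well as the helper names `fill_with_mode`, `drop_missing` and `predict_missing` are assumptions introduced for the example. A 1-nearest-neighbour rule stands in for the classifier.

```python
from collections import Counter

# Toy rows standing in for the ads dataset; all values are made up.
rows = [
    {"maker": "toyota", "engine_power": 70, "price": 9000},
    {"maker": "toyota", "engine_power": 75, "price": 9500},
    {"maker": "bmw", "engine_power": 180, "price": 30000},
    {"maker": "bmw", "engine_power": 190, "price": 32000},
    {"maker": "toyota", "engine_power": 72, "price": 9200},
    {"maker": None, "engine_power": 185, "price": 31000},
    {"maker": None, "engine_power": 68, "price": 8800},
]


def fill_with_mode(rows, column):
    """Technique 1: replace missing values with the most frequent value."""
    known = [r[column] for r in rows if r[column] is not None]
    mode = Counter(known).most_common(1)[0][0]
    return [{**r, column: mode if r[column] is None else r[column]} for r in rows]


def drop_missing(rows, column):
    """Technique 2: delete rows where the column is null."""
    return [r for r in rows if r[column] is not None]


def predict_missing(rows, column, features):
    """Technique 3: predict missing values with a classifier.

    Hypothetical stand-in: a 1-nearest-neighbour rule on unscaled
    numeric features, trained on the rows where the column is known.
    """
    labelled = drop_missing(rows, column)

    def distance(a, b):
        # Squared Euclidean distance over the chosen features.
        return sum((a[f] - b[f]) ** 2 for f in features)

    filled = []
    for r in rows:
        if r[column] is None:
            nearest = min(labelled, key=lambda known: distance(r, known))
            r = {**r, column: nearest[column]}
        filled.append(r)
    return filled


mode_filled = fill_with_mode(rows, "maker")
complete_only = drop_missing(rows, "maker")
predicted = predict_missing(rows, "maker", ["engine_power"])
```

On this toy data the trade-offs are already visible: mode imputation labels the 185-power car as the majority brand, deletion keeps only five of the seven rows, and the classifier assigns each missing maker from the most similar labelled car.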

Links: