Machine learning models to improve AUC scores in large-scale, highly skewed datasets.

emilbiju/Ad-Click-Prediction

Ad Click Prediction

This project predicts the probability that a user clicks an ad displayed on a website. The dataset describes the advertisements displayed and the users' responses, with features including timestamp, website ID, browser information, and offer ID. The challenge is to build a classification model that correctly predicts whether a user will click an ad under various circumstances.

The primary issue with the dataset is that the Target feature is highly skewed: negative samples outnumber positive samples by almost 100 to 1. We first try conventional approaches using standard models, with the objective of achieving a high AUC score. The models used include:

  • XGBoost
  • LightGBM
  • Random Forest Classifier
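
To illustrate why AUC rather than accuracy is the objective on data this skewed, here is a minimal sketch in plain Python. It is not taken from this repository: the helper names `roc_auc` and `accuracy` and the toy data are assumptions made for illustration. A model that never predicts a click reaches about 99% accuracy at a 1:100 class ratio, yet its AUC is only 0.5, the same as random guessing.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Tied scores receive average ranks (Mann-Whitney U formulation).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at the given threshold."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


# Toy data with roughly the 1:100 class ratio described above (values are made up).
labels = [1] * 10 + [0] * 1000
always_negative = [0.0] * len(labels)  # a "model" that never predicts a click

acc = accuracy(labels, always_negative)  # high, despite finding no clicks
auc = roc_auc(labels, always_negative)   # no better than chance
```

Because AUC depends only on how positives are ranked relative to negatives, it is not inflated by the large negative class the way accuracy is.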

We then try two further approaches tailored to the specific problem of the skewed dataset.

  • Logistic Regression with a custom cost function and gradient update rule
  • XGBoost with a modified dataset to counter skewness
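
This README does not spell out the custom cost function or how the dataset is modified, so the following is only a sketch under stated assumptions. It assumes the cost is a class-weighted log loss (positives weighted by `pos_weight`) minimized by batch gradient descent, and that the dataset modification is random undersampling of negatives. The function names, hyperparameters, and synthetic data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_weighted_logreg(X, y, pos_weight, lr=0.1, epochs=200):
    """Batch gradient descent on a class-weighted log loss (assumed cost).

    loss = -(1/W) * sum_i w_i * [y_i*log(p_i) + (1 - y_i)*log(1 - p_i)]
    where w_i = pos_weight for positives, 1 for negatives, and W = sum_i w_i.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    sample_w = [pos_weight if yi == 1 else 1.0 for yi in y]
    total_w = sum(sample_w)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi, wi in zip(X, y, sample_w):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = wi * (p - yi)  # weighted gradient of the loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient update rule: step against the normalized weighted gradient.
        w = [wj - lr * gj / total_w for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / total_w
    return w, b


def undersample_negatives(X, y, neg_per_pos, seed=0):
    """Keep every positive and a random subset of negatives (assumed modification)."""
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    keep_neg = rng.sample(neg, min(len(neg), neg_per_pos * len(pos)))
    keep = sorted(pos + keep_neg)
    return [X[i] for i in keep], [y[i] for i in keep]


# Synthetic 1:100 data: one feature, with positives shifted to the right.
rng = random.Random(42)
X = [[rng.gauss(2.0, 1.0)] for _ in range(20)] + [[rng.gauss(0.0, 1.0)] for _ in range(2000)]
y = [1] * 20 + [0] * 2000

w, b = fit_weighted_logreg(X, y, pos_weight=100.0)
X_small, y_small = undersample_negatives(X, y, neg_per_pos=5)
```

With `pos_weight` set near the negative-to-positive ratio, the two classes contribute about equally to the gradient, so the rare clicks are not drowned out. The undersampled `X_small`, `y_small` could then be used to train a tree model such as XGBoost. Note that undersampling shifts the predicted probabilities upward, which leaves the ranking (and hence AUC) meaningful but means the raw outputs are no longer calibrated click probabilities.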
