My best submission to the Kaggle competition "Online Product Sales", ranked 21st out of 366 teams.

emanuele/kaggle_ops

kaggle_ops

My best submission to the Kaggle competition "Online Product Sales", ranked 21st out of 366 teams (score: 0.57885).

http://www.kaggle.com/c/online-sales/leaderboard


Requirements:

  1. NumPy, scikit-learn

  2. pandas (http://pandas.pydata.org/), needed only to create the initial dataset or to explore it.

  3. joblib (http://packages.python.org/joblib/), needed only if you want to run blender_parallel.py, i.e. to use all your cores with GradientBoostingRegressor().

  4. The "Online Product Sales" trainset/testset files, placed in the subdirectory "data/".


Usage:

  • "python explore.py" lets you look at the quantitative variables and decide which of them to put on a log scale.

  • "python create_dataset.py" creates and saves the dataset from the initial trainset/testset files.

  • "python blender.py" computes the actual submission (a simple blend of GradientBoosting models).

  • "python blender_parallel.py" computes the actual submission, splitting the computation across as many cores as you like.
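
For readers who want the general idea without opening the scripts, here is a minimal sketch of blending several GradientBoostingRegressor() models with joblib. It is not the code in blender_parallel.py: the synthetic data, the helper names fit_predict and blend, the seeds, the hyperparameters, and the choice to log-scale every feature with np.log1p are all assumptions made for illustration.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.ensemble import GradientBoostingRegressor


def fit_predict(seed, X_train, y_train, X_test):
    """Fit one model with its own random seed and predict on X_test."""
    # Hypothetical hyperparameters, chosen only to keep the demo fast.
    model = GradientBoostingRegressor(
        n_estimators=50, subsample=0.8, random_state=seed
    )
    model.fit(X_train, y_train)
    return model.predict(X_test)


def blend(X_train, y_train, X_test, seeds=(0, 1, 2, 3), n_jobs=1):
    """Average the predictions of one model per seed.

    n_jobs is passed to joblib; n_jobs=-1 would use every available core.
    """
    predictions = Parallel(n_jobs=n_jobs)(
        delayed(fit_predict)(seed, X_train, y_train, X_test) for seed in seeds
    )
    # Simple blend: unweighted mean over the models.
    return np.mean(predictions, axis=0)


# Synthetic, skewed data standing in for the competition files (assumption).
rng = np.random.default_rng(0)
X = rng.lognormal(size=(200, 5))
X_log = np.log1p(X)  # put the skewed quantitative variables on a log scale
y = 2.0 * X_log[:, 0] + X_log[:, 1] + rng.normal(scale=0.1, size=200)

X_train, y_train, X_test = X_log[:150], y[:150], X_log[150:]
blended = blend(X_train, y_train, X_test)
print(blended.shape)
```

Because subsample is below 1.0, each seed trains on different random subsets of the rows, so the averaged models are not identical. Raising n_jobs only changes how the fits are scheduled, not the blended result.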
