Skip to content

julian-douglas/House-Prices-Regression

Repository files navigation

Predicting House Prices - Kaggle Competition

This notebook is my submission for the Kaggle House Prices Competition, which contains a dataset containing the the features of 1460 houses, plus their selling prices.

This notebook uses several different regression methodologies in order to predict the price of a house given its features (its area, how many bedrooms it has, the year it was built, etc). My methodology is outlined as follows:

  • Load the dataset into a Pandas dataframe
  • Feature selection:
    • Identify which numerical features to keep based on their correlation with the sales price
    • Identify which features to keep from those based on their correlation with each other
    • Identify which categorical features to keep
  • Handle missing data
  • Feature Engineering
  • Data Pre-Processing:
    • One-hot encode the categorical features
    • Standardise and normalise the numerical features
  • Perform different regression models on the data and compare their performance (in terms of MRSE):
    • Linear Regression
    • Ridge Regression
    • Lasso Regression
    • Random Forest
    • Support Vector Machine
    • Extreme Gradient Boosting
  • And finally, make the submission on the test data.

About

My submission for the house prices competition on Kaggle.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages