Short term prediction

Pony Biam! edited this page Jun 7, 2020 · 7 revisions

A simple short-term predictive LightGBM model can be found in this notebook. The model was trained on 30 days of data to predict 3 days. This was done for both winter and summer.
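
The 30-day/3-day split can be sketched with pandas as below. This is a minimal illustration rather than the notebook's code: the function name `split_short_term`, the column name `timestamp`, the example dates, and the assumption that the 3 predicted days immediately follow the training window are all hypothetical.

```python
import pandas as pd


def split_short_term(df, train_start, train_days=30, test_days=3, time_col="timestamp"):
    """Return (train, test): a training window and the window right after it.

    Assumption: the test window starts exactly where the training window ends.
    """
    train_start = pd.Timestamp(train_start)
    train_end = train_start + pd.Timedelta(days=train_days)
    test_end = train_end + pd.Timedelta(days=test_days)

    # Half-open intervals [start, end) so no timestamp lands in both sets.
    train = df[(df[time_col] >= train_start) & (df[time_col] < train_end)]
    test = df[(df[time_col] >= train_end) & (df[time_col] < test_end)]
    return train, test


# Toy hourly data covering 40 days (hypothetical dates and values).
demo = pd.DataFrame(
    {"timestamp": pd.date_range("2017-01-01", periods=40 * 24, freq=pd.Timedelta(hours=1))}
)
demo["meter_reading"] = range(len(demo))

train, test = split_short_term(demo, "2017-01-01")
```

Calling the function once with a winter start date and once with a summer start date would give the two seasonal experiments.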

shortterm-split

Features

Simple feature engineering was performed based on the exploratory data analysis (EDA). The EDA of meter readings showed that:

  • Healthcare, Food sales and services, and Utility usages show the highest meter reading values.
  • The hotwater meter shows the highest meter reading values.
  • Monthly behaviour (meter-reading median) shows higher readings in the warm season.
  • Hourly behaviour (meter-reading median) shows higher values from 6:00 to 19:00.
  • Weekday behaviour: readings are lower during weekends.

The following sections list the features that were selected, transformed and created.

Selection

The following features were selected from each data set:

  • Building metadata
    • Building ID*
    • Site ID*
    • Primary space usage
    • Building size (sqm)
  • Weather data
    • Timestamp*
    • Site ID*
    • Air temperature
  • Meter reading data
    • Timestamp*
    • Building ID*
    • meter
    • meter reading (target)
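
One way to combine the three data sets is sketched below, assuming the starred columns are the join keys. The column names (`building_id`, `site_id`, `sqm`, `airTemperature`, `meter_reading`) and the toy values are illustrative assumptions, not taken from the notebook.

```python
import pandas as pd

# Hypothetical miniature versions of the three data sets.
meta = pd.DataFrame({
    "building_id": ["b1", "b2"],
    "site_id": ["s1", "s1"],
    "primaryspaceusage": ["Healthcare", "Office"],
    "sqm": [1200.0, 800.0],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00", "2017-01-01 01:00"]),
    "site_id": ["s1", "s1"],
    "airTemperature": [3.5, 3.1],
})
meters = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-01-01 00:00", "2017-01-01 01:00", "2017-01-01 00:00"]),
    "building_id": ["b1", "b1", "b2"],
    "meter": ["electricity", "electricity", "hotwater"],
    "meter_reading": [10.0, 12.0, 30.0],
})

# Meter readings are the base table: attach building metadata by building,
# then weather by site and timestamp. Left joins keep every reading.
data = (
    meters
    .merge(meta, on="building_id", how="left")
    .merge(weather, on=["site_id", "timestamp"], how="left")
)
```

Left joins are used so that readings with missing weather data are kept (with a missing air temperature) instead of being dropped silently.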

Transformation

The following features were transformed:

  • The 16 primaryspaceusage categories were reduced to four: food sales and services, healthcare, utility and other.
  • The 8 meter categories were preserved.
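
The category reduction could look like the sketch below. The raw labels in `KEPT_USAGES` and the short output names are assumptions for illustration; the actual spelling of the categories in the data set may differ.

```python
import pandas as pd

# Hypothetical raw labels mapped to the three usages that are kept;
# anything else (including missing values) becomes "other".
KEPT_USAGES = {
    "Healthcare": "healthcare",
    "Food sales and service": "food",
    "Utility": "utility",
}


def reduce_usage(series):
    """Collapse primary space usage into healthcare, food, utility or other."""
    return series.map(KEPT_USAGES).fillna("other")


usage = pd.Series(["Healthcare", "Office", "Utility", "Food sales and service", None])
reduced = reduce_usage(usage)
```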

Creation

The following features were created:

  • day of the week
  • hour of the day
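
Both created features can be derived from the timestamp with the pandas `.dt` accessor. A minimal sketch, assuming a column named `timestamp` and the hypothetical output names `hour` and `dayofweek`:

```python
import pandas as pd


def add_time_features(df, time_col="timestamp"):
    """Return a copy of df with hour-of-day and day-of-week columns added."""
    out = df.copy()
    ts = pd.to_datetime(out[time_col])
    out["hour"] = ts.dt.hour            # 0 to 23
    out["dayofweek"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    return out


demo = pd.DataFrame({"timestamp": ["2017-01-01 06:00", "2017-01-02 19:00"]})
features = add_time_features(demo)
```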

Final features

  • Timestamp*
  • Site ID
  • Building ID
  • Hour
  • Day of the week
  • Usage (4 levels: healthcare, food, utility, other)
  • Building size (sqm)
  • Air temperature
  • Meter (8 levels)
  • Meter reading / target

Parameters

Parameters for this model were not systematically tuned; they were adjusted manually to perform better than the defaults.

  • "objective": "regression"
  • "metric": "rmse"
  • "random_state": 55
  • "learning_rate": 0.01 (default 0.1)
  • "max_bin": 761 (default 255)
  • "num_leaves": 2197 (default 31)
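
As a sketch, the parameters above can be passed to LightGBM's training API as a dictionary. The helper `train_model` and the value of `num_boost_round` are placeholders, since the page does not state how many boosting rounds were used.

```python
# Parameters exactly as listed above.
params = {
    "objective": "regression",
    "metric": "rmse",
    "random_state": 55,
    "learning_rate": 0.01,  # default 0.1
    "max_bin": 761,         # default 255
    "num_leaves": 2197,     # default 31
}


def train_model(X_train, y_train, num_boost_round=100):
    """Train a LightGBM regressor with the parameters above.

    num_boost_round=100 is a placeholder, not a value from the source.
    """
    # Imported here so the parameter dictionary can be inspected
    # even where LightGBM is not installed.
    import lightgbm as lgb

    train_set = lgb.Dataset(X_train, label=y_train)
    return lgb.train(params, train_set, num_boost_round=num_boost_round)
```

The returned booster exposes `predict`, which can be applied to the 3-day test window.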

Results

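Performance, as expected, was poor for this model: R2 is negative for every meter in both seasons, meaning the predictions fit worse than a constant equal to the mean reading. It can be used as a baseline for more complex models.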

Winter

shortterm-winter-plot1
Figure 1: real and predicted meter_reading values for the short-term winter model vs. timestamp.

shortterm-winter-plot2
Figure 2: meter_reading values predicted by the short-term winter model vs. real values.

meter/metric  RMSE         RMSLE   CVRMSE       MBE          R2
all           63793.3281   3.585   893.9526     -1.3002      -0.343
electricity   519.7323     2.9424  365.1851     -204.7482    -3.3066
water         2870.6986    4.1811  370.3285     25.1321      -0.0163
chilledwater  102821.528   3.9423  545.8142     10.6727      -0.2252
hotwater      186015.753   5.4836  334.6407     -8.7227      -0.7154
gas           3242.0113    5.3402  385.3021     -6.2478      -0.3415
steam         2473.409     2.5389  290.4014     -15.6519     -0.5368
solar         809.158      5.5228  1504.3307    -1199.6183   -58.916
irrigation    1844.7737    5.7132  470.3007     -40.5796     -0.1421

Table 1: metrics for the short-term winter model, calculated for all meters together and for each meter individually.

Summer

shortterm-summer-plot1
Figure 3: real and predicted meter_reading values for the short-term summer model vs. timestamp.

shortterm-summer-plot2
Figure 4: meter_reading values predicted by the short-term summer model vs. real values.

meter/metric  RMSE         RMSLE   CVRMSE       MBE          R2
all           160223.041   5.1053  1076.0289    -9.3812      -0.4323
electricity   3840.3759    4.8381  2602.8031    -2593.092    -187.2267
water         3852.6936    6.3158  999.4388     -957.9767    -14.316
chilledwater  373663.17    3.5728  504.8544     10.4382      -0.5017
hotwater      56641.9717   6.4026  286.6051     12.3044      -0.1629
gas           4284.2112    6.7518  751.958      -627.9519    -2.9741
steam         4008.5469    5.5218  1399.8411    -1331.0957   -13.0241
solar         3899.8942    7.4042  82942.7713   -82941.967   -251333.771
irrigation    26549.0901   8.3586  12386.1993   -5364.6018   -342.9041

Table 2: metrics for the short-term summer model, calculated for all meters together and for each meter individually.
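
For reference, the five metrics in the tables can be computed along the lines of the sketch below. RMSE and R2 follow their standard definitions. The other choices are assumptions, because the page does not define them: CVRMSE is taken as RMSE divided by the mean real value in percent, MBE as the mean of (real minus predicted) normalized by the mean real value in percent, and RMSLE clips negative predictions to zero. The notebook may use different conventions.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, RMSLE, CVRMSE, MBE and R2 as a dict of floats.

    Assumes non-negative real values with a non-zero mean.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_true = y_true.mean()

    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

    # Assumption: negative predictions are clipped so that log1p is defined.
    log_diff = np.log1p(np.clip(y_pred, 0, None)) - np.log1p(y_true)
    rmsle = np.sqrt(np.mean(log_diff ** 2))

    # Assumption: both normalized by the mean real value, in percent.
    cvrmse = 100 * rmse / mean_true
    mbe = 100 * np.mean(y_true - y_pred) / mean_true

    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - mean_true) ** 2)
    r2 = 1 - ss_res / ss_tot

    return {
        "RMSE": float(rmse),
        "RMSLE": float(rmsle),
        "CVRMSE": float(cvrmse),
        "MBE": float(mbe),
        "R2": float(r2),
    }


# Predicting the mean everywhere gives R2 = 0; anything worse is negative.
baseline = regression_metrics([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])
perfect = regression_metrics([1, 2, 3, 4], [1, 2, 3, 4])
```

With this definition of R2, a negative value means the squared error of the model exceeds that of a constant prediction at the mean.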