
Add the possibility to use cross validation when training PyAF models #105

Closed
antoinecarme opened this issue Sep 25, 2018 · 10 comments
@antoinecarme
Owner

antoinecarme commented Sep 25, 2018

Following the investigation performed in #53, implement a form of cross validation for PyAF models.

Specifications :

  1. Cut the dataset into folds according to a scikit-learn time series split:
    http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation
    number of folds => user option (default = 10)

  2. To have enough data, use only the last n/2 folds for estimating the models (thanks to forecast R package ;). The default splits look like this :
    [5] [6]
    [5 6] [7]
    [5 6 7] [8]
    [5 6 7 8] [9]
    [5 6 7 8 9] [10]

  3. Use the model decomposition type or formula as a hyperparameter and optimize it: select the decomposition(s) with the lowest mean MAPE on the validation datasets of all the possible splits.

  4. Among all the chosen decompositions, select the model with the lowest complexity (~ number of inputs).

  5. Execute the procedure on the ozone and air passengers datasets and compare with the non-cross-validation models (=> 2 jupyter notebooks).
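The split scheme in step 2 can be sketched with scikit-learn's `TimeSeriesSplit`. Note that `TimeSeriesSplit` always grows the training window from the start of the series, so dropping the first half before splitting (as below) is just one way to approximate the "last n/2 folds" rule; this is an illustration, not PyAF code:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy series: 10 "folds" of one point each, labelled 1..10.
n_folds = 10
data = np.arange(1, n_folds + 1)

# Drop the first half so that training windows start at fold 5,
# mimicking the "use only the last n/2 folds" rule.
half = data[n_folds // 2 - 1:]               # folds 5..10
tscv = TimeSeriesSplit(n_splits=len(half) - 1)
for train_idx, test_idx in tscv.split(half):
    print(list(half[train_idx]), list(half[test_idx]))
# [5] [6]
# [5, 6] [7]
# ...
# [5, 6, 7, 8, 9] [10]
```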

@antoinecarme
Owner Author

antoinecarme commented Sep 25, 2018

Classical PyAF modeling is a special case of this cross validation with 1 split (nfolds = 5, split = [1 2 3 4] [5]). So the implementation should be made by adapting the existing code. Training on each one of the splits is equivalent to training an old-style model.
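The single-split special case can be illustrated the same way (again with scikit-learn, purely as a sketch, not PyAF's implementation):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Classical (non-cross-validated) training as a single split:
# nfolds = 5, train on folds [1 2 3 4], validate on fold [5].
data = np.arange(1, 6)                              # five folds of one point each
tscv = TimeSeriesSplit(n_splits=4)
train_idx, test_idx = list(tscv.split(data))[-1]    # keep only the last split
print(list(data[train_idx]), list(data[test_idx]))  # [1, 2, 3, 4] [5]
```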

@antoinecarme antoinecarme self-assigned this Sep 25, 2018
@NowanIlfideme

Hi, I've been watching your project for a while (mostly - I have been working on a similar project, which comes at this from a different perspective 😛).
I'd just like to note that, from the business case, there are (at least) two different kinds of time series CV: with and without re-training on the set. The first one (the one you've described above) is useful for settings where you can constantly re-train your model. The second is for when you don't have the ability to re-train, but want to know what the model will do on future, shorter folds. This is relevant for models with hidden components (e.g. ARIMA, state-space models, RNNs, ...) where the state can be much different when starting later than from the beginning (as an analogy, a Markov chain that is not yet in its stationary distribution).

@antoinecarme
Owner Author

@NowanIlfideme

Thanks a lot for your interest in PyAF. Comments like these are always welcome. Hope you enjoy it.

Models with state/hidden components are not yet supported, but if you look closely, PyAF is always evolving. Cross-validation work started a year ago; its first implementation will be available in the coming weeks.

Can you please elaborate a little bit more on the second case (a Python example in a gist?)? Any docs/references?

@NowanIlfideme

I don't quite have the time to make a full example; I hope a block thing will work. :)

Full Set:
[1 2 ... N N+1 ... 2N]

Train (same for all):
[1 2 3 ... N]

Validation:
Sees [1 ... N], predicts [N+1]
Sees [2 ... N+1], predicts [N+2]
...
Sees [N-1 ... 2N-1], predicts [2N]

If you only use stateless models, this is the same as validating on the set [N+1 ... 2N]. However, for stateful models, it means you will always use [N*num_per_set] steps to "warm up" your model, and thus get consistent behavior (you'd do this in production as well).

As an alternative, you could use the following scheme for stateless models as well:

Trains on [1 ... N], predicts [N+1]
Trains on [2 ... N+1], predicts [N+2]
...
Trains on [N-1 ... 2N-1], predicts [2N]

This will always give a "window", and again be consistent. However, the end use of these methods is different. 😃
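The two schemes above differ only in whether the model is re-fitted at each step. A minimal sketch, assuming a hypothetical `fit`/`predict` model interface (the names are illustrative, not PyAF's API):

```python
import numpy as np

def validate(series, n, model, retrain=False):
    """Walk-forward validation over the second half of the series.

    retrain=False: fit once on the first n points, then only slide the
                   context window (the stateful-model scheme: warm up,
                   never re-fit).
    retrain=True:  re-fit on each window before predicting (the
                   rolling-origin scheme for stateless models).
    """
    if not retrain:
        model.fit(series[:n])
    errors = []
    for i in range(len(series) - n):
        window = series[i:i + n]
        if retrain:
            model.fit(window)
        pred = model.predict(window)          # one-step-ahead forecast
        errors.append(abs(pred - series[i + n]))
    return np.mean(errors)

# A trivial "model" that predicts the last observed value (illustrative).
class NaiveModel:
    def fit(self, history):
        pass
    def predict(self, window):
        return window[-1]

series = np.arange(1.0, 9.0)                  # 2N points with N = 4
print(validate(series, 4, NaiveModel()))                # mean absolute error, no re-training
print(validate(series, 4, NaiveModel(), retrain=True))  # same metric, re-fitting each step
```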

@antoinecarme
Owner Author

The block thing is clear and very interesting ;). I will keep this aside for implementing support for stateful models.

Do you have any book reference for this kind of stuff? Putting time series models in production, etc.

@NowanIlfideme

I'm going mainly by experience, sorry that I can't give any written reference. Cheers!

@antoinecarme
Owner Author

Cheers!

@antoinecarme
Owner Author

@NowanIlfideme

What about summarizing your experience in a GitHub repository (markdown)? I am also not aware of a written reference for this kind of stuff. Please think of it when you have some time.

Thanks a lot.

antoinecarme added a commit that referenced this issue Sep 26, 2018
antoinecarme added a commit that referenced this issue Sep 26, 2018
antoinecarme added a commit that referenced this issue Sep 26, 2018
antoinecarme added a commit that referenced this issue Sep 26, 2018
…105

Added separate cSignalDecompositionTrainer and cSignalDecompositionTrainer_CrossValidation
antoinecarme added a commit that referenced this issue Sep 26, 2018
@antoinecarme
Owner Author

antoinecarme commented Sep 26, 2018

This is how to adapt the training process to activate cross validation in PyAF (with 7 folds):

    import pyaf.ForecastEngine as autof

    # enable time series cross validation ("TSCV") with 7 folds
    lEngine = autof.cForecastEngine()
    lEngine.mOptions.mCrossValidationOptions.mMethod = "TSCV"
    lEngine.mOptions.mCrossValidationOptions.mNbFolds = 7
    lEngine.train(ozone_dataframe, 'Month', 'Ozone', 12)
    lEngine.getModelInfo()

antoinecarme added a commit that referenced this issue Sep 27, 2018
antoinecarme added a commit that referenced this issue Sep 27, 2018
…105

Added a jupyter notebook with air passengers case
antoinecarme added a commit that referenced this issue Sep 27, 2018
Add the possibility to use cross validation when training PyAF models #105
@antoinecarme
Owner Author

FIXED!!!!
