Ditch Trainer percent arguments, make overfit great again #1668

Closed
iakremnev opened this issue Apr 29, 2020 · 3 comments · Fixed by #2213

iakremnev commented Apr 29, 2020

🚀 Feature

Replace the *_percent_check arguments with limit_*_batches and redesign the overfit_pct Trainer argument.

Motivation

Over the past few days I have been struggling to use these parameters in my tasks. For example, I want to run an overfitting test for an image classification model: take a few (say, 2) batches from my training dataset, train on them for several epochs, and then assert 100% accuracy on those same batches.
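A minimal sketch of the check I have in mind (LitClassifier, full_train_dataset and batch_size are hypothetical placeholders for my own module and data, not Lightning APIs):

```python
# Hypothetical overfit check. LitClassifier, full_train_dataset and batch_size
# stand in for the user's own LightningModule, dataset and settings.
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Subset

batch_size = 32
# Keep only enough samples for two batches so every epoch sees identical data.
small_train = Subset(full_train_dataset, range(2 * batch_size))
train_loader = DataLoader(small_train, batch_size=batch_size, shuffle=False)

model = LitClassifier()
trainer = pl.Trainer(max_epochs=50)
trainer.fit(model, train_dataloader=train_loader)

# After enough epochs the model should have memorized both batches.
model.eval()
with torch.no_grad():
    preds = torch.cat([model(x).argmax(dim=1) for x, _ in train_loader])
    targets = torch.cat([y for _, y in train_loader])
assert (preds == targets).float().mean() == 1.0  # expect 100% train accuracy
```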

The problems I stumbled on are the following:

  1. The overfit_pct documentation is misleading. A recent clarification explains that it sets the *_percent_check parameters to the given value, but it still doesn't actually help you overfit a model, since you can't simply run trainer.test() or trainer.run_evaluation() after trainer.fit(model) without manipulating the model's dataloaders.
  2. If the value of val_percent_check is too small, which can easily happen when you use overfit_pct with a small training dataset in mind, the validation loop is silently skipped and you hit an exception in model.validation_epoch_end while accumulating the loss over batches. Handling the latter is reasonably on me, since I override this method, but it would be much nicer if PyTorch Lightning caught such an unexpected loop skip. You guys are great and I want to love your project even more!
  3. train_percent_check doesn't guarantee training on the same small part of the training dataset every epoch, because it is best practice to shuffle the training data each epoch. As a result, new batches are formed every epoch and therefore there is no overfitting :( (see the sketch after this list).
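
To make point 3 concrete, here is a tiny self-contained demonstration (plain PyTorch, no Lightning) of why any percent-based cut of a shuffled loader sees different samples each epoch:

```python
# With shuffle=True the "first batches" contain different samples every epoch,
# so limiting training to the first X% never fixes a subset to overfit on.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100))
loader = DataLoader(dataset, batch_size=10, shuffle=True)

for epoch in range(3):
    first_batch = next(iter(loader))[0]  # a fresh iterator reshuffles the data
    print(f"epoch {epoch}: first batch samples {first_batch.tolist()}")
# The printed indices differ between epochs, hence no stable set to overfit on.
```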

Pitch

  1. I find redesigning these dataset-limiting parameters as absolute numbers of batches more straightforward and more useful than the current behavior. After all, I dare say most researchers and developers think in terms of a number of batches rather than a percentage of the dataset.
  2. overfit_pct is either removed or actually redesigned to help test overfitting, i.e. it replaces the validation and test loaders with the train loader, ensures the training dataset isn't shuffled, and trains on the same batches every epoch (a sketch of such an interface follows this list).
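
A rough sketch of how the pitched arguments could look from the user's side (the names follow the limit_*_batches proposal above; overfit_batches is just one plausible name for the redesigned overfit switch, not a confirmed API):

```python
import pytorch_lightning as pl

# Limit each loop to an absolute number of batches instead of a fraction.
trainer = pl.Trainer(limit_train_batches=2,
                     limit_val_batches=2,
                     limit_test_batches=2)

# Overfit mode: reuse the same few training batches for validation/test as
# well, with shuffling forced off so the exact same batches repeat each epoch.
overfit_trainer = pl.Trainer(overfit_batches=2, max_epochs=100)
```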

Additional notes

I realize that you cannot prohibit shuffling when only the *_percent_check parameters are used. There can be experiments where you want to see how your model performs when trained on only a portion of the data. Therefore such a prohibition is valid only for overfit mode.

iakremnev added the feature (Is an improvement or enhancement) and help wanted (Open to be worked on) labels on Apr 29, 2020

tshrjn commented Apr 30, 2020

Couldn't agree more with the shuffling part: overfit_pct doesn't control the shuffle flag of the train_dataloader. I think that flag should really be set by Lightning and not by the user, so that overfit_pct is the do-it-all switch for overfitting that it claims to be.
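
One way the framework could enforce this (just a sketch of the idea, not how Lightning actually implements it) is to rebuild the user's DataLoader with a SequentialSampler whenever overfit mode is on:

```python
# Sketch: rebuild a DataLoader without shuffling for overfit mode.
from torch.utils.data import DataLoader, SequentialSampler

def force_sequential(loader: DataLoader) -> DataLoader:
    # Reuse the original dataset and settings, but sample it in a fixed order.
    return DataLoader(loader.dataset,
                      batch_size=loader.batch_size,
                      sampler=SequentialSampler(loader.dataset),
                      num_workers=loader.num_workers,
                      collate_fn=loader.collate_fn)
```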

williamFalcon commented Jun 16, 2020

Fixed!
Will be available in 0.8.0

williamFalcon commented Jun 17, 2020

All of these were fixed in #2213 and #2220.

Thanks for the suggestions! Keep them coming.
