Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename and fix overfit_pct, val_percent_check and test_percent_check #2213

Merged
merged 31 commits into from
Jun 17, 2020

Conversation

williamFalcon
Copy link
Contributor

@williamFalcon williamFalcon commented Jun 16, 2020

Fixes #1668
Fixes #1920
Fixes #830

Overfit

  • Removes overfit_pct
  • Adds overfit_batches which can be a float (%) or int (steps)
  • overfit now makes sure train, test and val splits are all the train step
  • sets shuffle=False in the train set

limit_xxx_batches

  • removes train_percent_check and val_percent_check
  • replaces with limit_train_batches and limit_val_batches which can also be int (steps) or percent (float)

docs + tests

  • adds tests to verify
  • updates docs

@tullie these flag names make sense?

TODO: also replace train_percent_check with limit_train_batches

@pep8speaks
Copy link

pep8speaks commented Jun 16, 2020

Hello @williamFalcon! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-17 11:54:19 UTC

@awaelchli
Copy link
Member

@williamFalcon are you trying to fix this: #1920 ?

@williamFalcon
Copy link
Contributor Author

@awaelchli partly. also enabling this to take a number of batches instead of just a percent.

@williamFalcon williamFalcon changed the title Pctc [WIP] Pctc Jun 17, 2020
@Borda Borda added the bug Something isn't working label Jun 17, 2020
@Borda Borda self-requested a review June 17, 2020 00:18
@williamFalcon williamFalcon changed the title [WIP] Pctc [WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) Jun 17, 2020
@williamFalcon williamFalcon changed the title Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) [WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) Jun 17, 2020
@awaelchli
Copy link
Member

besides the new Trainer flags, it's not easy to see what's the difference to #1920. It seems you have re-implemented it.

@Borda
Copy link
Member

Borda commented Jun 17, 2020

@williamFalcon are you trying to fix this: #1920 ?

@awaelchli partly. also enabling this to take a number of batches instead of just a percent.

so shall #1920 be merged first? otherwise, it may raise plenty of conflicts...

@Borda Borda changed the title [WIP] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) [WIP, blocked by #1920] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) Jun 17, 2020
@Borda Borda changed the title [WIP, blocked by #1920] Rename overfit_pct to overfit_batches (and fix) and val_percent_check and test_percent_check (and fix) [WIP, blocked by #1920] Rename and fix overfit_pct, val_percent_check and test_percent_check Jun 17, 2020
Copy link
Member

@Borda Borda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's merge #1920 first 🐰

@mergify mergify bot requested a review from a team June 17, 2020 07:12
@mergify
Copy link
Contributor

mergify bot commented Jun 17, 2020

This pull request is now in conflict... :(

@williamFalcon williamFalcon changed the title [WIP, blocked by #1920] Rename and fix overfit_pct, val_percent_check and test_percent_check Rename and fix overfit_pct, val_percent_check and test_percent_check Jun 17, 2020
@codecov
Copy link

codecov bot commented Jun 17, 2020

Codecov Report

Merging #2213 into master will decrease coverage by 0%.
The diff coverage is 88%.

@@          Coverage Diff           @@
##           master   #2213   +/-   ##
======================================
- Coverage      88%     88%   -0%     
======================================
  Files          69      69           
  Lines        5269    5289   +20     
======================================
+ Hits         4625    4637   +12     
- Misses        644     652    +8     

@williamFalcon williamFalcon merged commit 04c794c into master Jun 17, 2020
@Borda
Copy link
Member

Borda commented Jun 17, 2020

btw, this was not reviewed as @awaelchli @Borda asked to rebase it on #1920
btw2, there are many unrelated changes, probably artefacts from merging master

@tullie
Copy link
Contributor

tullie commented Jun 17, 2020

I wonder if overfit_batches can be removed entirely. It's good to strive for an API where there's only one way to do things, and I feel like the problem with these trainer args is that it's not clear what the difference between setting overfit_batches=1 or limit_val_batches=1 and limit_train_val=1 is?

Regardless, this PR is still an improvement to how it was before. Is there any reason you didn't put deprecation warnings in?

@williamFalcon
Copy link
Contributor Author

This was a two-part PR.
The warnings are here #2220

This was the time to remove overfit but i realized it actually serves a different approach.

overfit_batches will actually use the train set for train, val and test so you can truly overfit. It also removes the shuffle from the training set so you iterate over the same exact stuff.

However, the other 3 flags don't serve that purpose. Those are more for only using some of the data for training, val or test (likely for debugging purposes)

@pwl
Copy link
Contributor

pwl commented Jun 19, 2020

Hi, I'm getting an error

~/.local/lib/python3.7/site-packages/pytorch_lightning/callbacks/progress.py in total_val_batches(self)
     99             is_val_epoch = trainer.current_epoch % trainer.check_val_every_n_epoch == 0
    100             total_val_batches = trainer.num_val_batches if is_val_epoch else 0
--> 101             total_val_batches = sum(total_val_batches)
    102         return total_val_batches
    103 

TypeError: 'int' object is not iterable

from a line added in this PR: 04c794c#diff-c611176c04f224f1ec16a768aec9c527R101

Turns out sum(0) doesn't work in python, so maybe it should be

total_val_batches = sum(trainer.num_val_batches) if is_val_epoch else 0

I'm writing this here instead in a new issue, as this is a very fresh PR, I hope that's ok. Also, I wonder how this got past the tests.

@Borda
Copy link
Member

Borda commented Jun 19, 2020

@pwl good catch, mind sending a fix PR?

pwl added a commit to pwl/pytorch-lightning that referenced this pull request Jun 19, 2020
Fixes a minor bug introduced in Lightning-AI#2213
Borda pushed a commit to pwl/pytorch-lightning that referenced this pull request Jun 19, 2020
Fixes a minor bug introduced in Lightning-AI#2213
williamFalcon pushed a commit that referenced this pull request Jun 19, 2020
Fixes a minor bug introduced in #2213
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ditch Trainer percent arguments, make overfit great again overfit_pct vs train_percent_check etc
6 participants