
Support limit_mode_batches (int) for infinite dataloader #2787

Merged: 15 commits merged from fix_limit_batches_infdl into master on Aug 5, 2020

Conversation

@rohitgr7 (Contributor) commented Aug 1, 2020

What does this PR do?

Fixes #2649

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

mergify bot requested a review from a team August 1, 2020 11:12
raise MisconfigurationException(
    'When using an infinite DataLoader (e.g. with an IterableDataset'
    f' or when DataLoader does not implement `__len__`) for `limit_{mode}_batches`,'
    f' `Trainer(limit_{mode}_batches)` must be `0.0`, `1.0` or `int`')
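For context on when this exception can fire, here is a minimal sketch (illustrative only, not code from the PR; the helper name _num_batches is hypothetical) of how a loader with no usable __len__ ends up being treated as infinite:

# Hypothetical helper: a DataLoader wrapping an IterableDataset without a length
# raises TypeError on len(), and the trainer then falls back to an "infinite" count.
def _num_batches(dataloader):
    try:
        return len(dataloader)   # finite dataloader: a concrete batch count
    except TypeError:
        return float('inf')      # no usable __len__: treat as infinite

With an infinite batch count, a fractional limit such as 0.5 has no defined meaning; the message above restricts the value to 0.0, 1.0, or an int, and the thread below debates whether the float values should be allowed at all.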
@rohitgr7 (Contributor Author):
How can limit_mode_batches be 0.0 or 1.0 for an infinite dataloader?

Member:

As far as I remember, a value of 0.0 should just run nothing and a value of 1.0 should run until the dataloader stops. If it is an actually infinite dataloader, then it would just never end, so perhaps this should be written differently?

@rohitgr7 (Contributor Author) commented Aug 1, 2020:

Yeah, as of now I just moved some if-else statements around, and with 0.0 it was neither working before nor is it working now. I can change it as required. But like you said, I agree it should work with both 0.0 and int, and otherwise raise this exception with an updated message here.

@rohitgr7 (Contributor Author) commented Aug 1, 2020:

And not just for 1.0: this exception should be raised for every 0.0 < limit_mode_batches <= 1.0.

@rohitgr7 (Contributor Author):

@awaelchli what do you suggest here? I'll update and add tests accordingly.

@awaelchli (Member) commented Aug 2, 2020:

We have the same for val_check_interval=1.0 (but we don't print a warning), where we distinguish between 1.0 (float, 100% of the epoch) and 1 (int, batch count). I think it is reasonable to differentiate using the type, and the docs make it very clear with the examples. Regarding this exception message, I think it should be changed to "When using an IterableDataset (e.g. of infinite size ...". Or just not refer to an infinite dataset, as Ethan suggested.
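To make the float-vs-int distinction concrete, a short usage sketch (the flags are the Trainer arguments discussed here; the values are illustrative):

# Floats are read as a fraction of the dataloader length / epoch.
from pytorch_lightning import Trainer

trainer = Trainer(limit_val_batches=0.25)  # run 25% of the validation batches
trainer = Trainer(val_check_interval=1.0)  # validate once per full training epoch

# Ints are read as an absolute batch count, which is the only sensible option
# when the dataloader is infinite and its length is unknown.
trainer = Trainer(limit_val_batches=100)   # run exactly 100 validation batches
trainer = Trainer(val_check_interval=500)  # validate every 500 training batches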

@rohitgr7 (Contributor Author):

And for non-iterable datasets, should 0.0 be supported with an infinite dataloader?

@rohitgr7 (Contributor Author):

And separate exceptions for IterableDataset and infinite datasets? So that for IterableDataset 1.0 is supported, and for an infinite dataloader 0.0 and int are supported?

Member:

Can you clarify what that would look like? I don't know of any infinite datasets that are not of type IterableDataset.

@rohitgr7 (Contributor Author):

Neither do I; I've never seen one. To be honest, I didn't even know about IterableDataset before this point 😅. Well, I guess I'll revert the recent commits and change the exception message as suggested by Ethan.


codecov bot commented Aug 1, 2020

Codecov Report

Merging #2787 into master will increase coverage by 4%.
The diff coverage is 85%.

@@           Coverage Diff           @@
##           master   #2787    +/-   ##
=======================================
+ Coverage      86%     90%    +4%     
=======================================
  Files          78      78            
  Lines        7001    7001            
=======================================
+ Hits         6033    6317   +284     
+ Misses        968     684   -284     


pep8speaks commented Aug 2, 2020

Hello @rohitgr7! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-08-05 16:40:43 UTC

@rohitgr7 changed the title from "[WIP] Support limit_mode_batches (int) for infinite dataloader" to "Support limit_mode_batches (int) for infinite dataloader" on Aug 3, 2020

mergify bot commented Aug 4, 2020

This pull request is now in conflict... :(

@Borda added the allowed_pre_1.0 and feature (Is an improvement or enhancement) labels on Aug 4, 2020

rohitgr7 commented Aug 4, 2020

Should I add supporting docs here? https://pytorch-lightning.readthedocs.io/en/stable/sequences.html?highlight=IterableDataset#iterable-datasets

@awaelchli (Member) left a comment:

nice, also the tests 🚀

Resolved review threads (outdated): pytorch_lightning/trainer/data_loading.py, tests/trainer/test_dataloaders.py, CHANGELOG.md
mergify bot requested a review from a team August 5, 2020 06:01
mergify bot requested a review from a team August 5, 2020 08:13

mergify bot commented Aug 5, 2020

This pull request is now in conflict... :(

@ethanwharris (Member) left a comment:

Awesome, looks good to me :) We can clean up the last elif slightly.

Resolved review threads (outdated): pytorch_lightning/trainer/data_loading.py (two threads)

mergify bot commented Aug 5, 2020

Great job! =)

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Resolved review threads (outdated): docs/source/sequences.rst, pytorch_lightning/core/lightning.py, pytorch_lightning/trainer/data_loading.py
if isinstance(self.limit_train_batches, int) or self.limit_train_batches == 0.0:
    self.num_training_batches = min(self.num_training_batches, int(self.limit_train_batches))
elif self.num_training_batches != float('inf'):
    self.num_training_batches = int(self.num_training_batches * self.limit_train_batches)
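A worked example of the branch above (hypothetical numbers; a 1000-batch train loader is assumed):

# Hypothetical worked example of the int / float / infinite cases.
num_training_batches = 1000                         # a finite, 1000-batch train loader
assert int(num_training_batches * 0.25) == 250      # float limit: fraction of the epoch
assert min(num_training_batches, int(100)) == 100   # int limit: absolute batch count

num_training_batches = float('inf')                 # infinite / IterableDataset loader
assert min(num_training_batches, int(100)) == 100   # an int limit still works
# a float limit other than 0.0 falls through both branches and hits the
# MisconfigurationException shown earlier in the diff.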
@Borda (Member):

Can it also be 5.1, meaning 510%?
cc: @PyTorchLightning/core-contributors

Member:

@Borda I don't think so. We iterate like this:

# run epoch
for batch_idx, (batch, is_last_batch) in self.profiler.profile_iterable(
        enumerate(_with_is_last(train_dataloader)), "get_train_batch"
):
    # stop epoch if we limited the number of training batches
    if batch_idx >= self.num_training_batches:
        break

so if the loader is exhausted before that, iteration simply stops, meaning that the condition will never be True.
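A runnable illustration of that point (hypothetical stand-ins; range(100) plays the role of a 100-batch loader):

# Why a fraction above 1.0 cannot extend an epoch: the loader runs out first.
loader = range(100)                    # stands in for a 100-batch train dataloader
num_training_batches = int(100 * 5.1)  # 510 batches "requested"
ran = 0
for batch_idx, batch in enumerate(loader):
    if batch_idx >= num_training_batches:
        break                          # never reached; enumeration stops at 100
    ran += 1
print(ran)                             # -> 100, not 510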

Member:

Still, it would be cleaner to have val = min(val, 1.0) there.

@rohitgr7 (Contributor Author):

We can call _check_batch_limits there to reject float values > 1.0.
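A sketch of what such a check could look like (the helper name comes from the comment above; the signature and messages are assumptions, not the actual implementation in data_loading.py):

from pytorch_lightning.utilities.exceptions import MisconfigurationException

# Hypothetical sketch of a `_check_batch_limits`-style guard.
def _check_batch_limits(name: str, value) -> None:
    if isinstance(value, float) and not 0.0 <= value <= 1.0:
        raise MisconfigurationException(
            f'`{name}` must be an int or a float in [0.0, 1.0], but got {value}.')
    if isinstance(value, int) and value < 0:
        raise MisconfigurationException(
            f'`{name}` must be non-negative, but got {value}.')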

mergify bot merged commit de9c9f0 into master Aug 5, 2020

mergify bot commented Aug 5, 2020

Great job! =)

@rohitgr7 (Contributor Author) commented Aug 5, 2020:

@Borda can we unmerge it? There is some irrelevant code that I just deleted locally and need to push. Or should I create a new PR with the updated changes?

@Borda deleted the fix_limit_batches_infdl branch August 5, 2020 19:47

@Borda (Member) commented Aug 5, 2020:

Yes, create a new PR and note that it is a fix for this one; let's get it done asap.

williamFalcon added a commit that referenced this pull request Aug 5, 2020
williamFalcon added a commit that referenced this pull request Aug 5, 2020

* Revert "Support limit_mode_batches (int) for infinite dataloader (#2787)"

This reverts commit de9c9f0.

* Update training_tricks.py
Successfully merging this pull request may close these issues: Integer limit_val_batches does not work with infinite DataLoader (#2649).