fix get dataloader size #2375
Conversation
try:
    num_batches = len(dataloader)
except (TypeError, NotImplementedError):
    num_batches = float('inf')
Suggested change:
-try:
-    num_batches = len(dataloader)
-except (TypeError, NotImplementedError):
-    num_batches = float('inf')
+num_batches = len(dataloader) if callable(getattr(dataloader, "__len__", None)) else float("inf")
Wouldn't that be better?
I think it is bad to use exceptions for control flow like this. If the user implements `__len__` but it raises a TypeError, we catch it here and the program crashes somewhere else, making it much harder for the user to debug their dataset. (#2266)
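To make the concern concrete, here is a minimal hypothetical sketch (the `BuggyDataset` class is invented for illustration) of how the try/except can swallow a genuine bug in a user's `__len__`:

```python
class BuggyDataset:
    def __len__(self):
        # Bug in user code: concatenating str and int raises TypeError.
        return "100" + 5

dataloader = BuggyDataset()

try:
    num_batches = len(dataloader)
except (TypeError, NotImplementedError):
    num_batches = float('inf')

print(num_batches)  # inf -- the user's TypeError is hidden and only surfaces later
```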
seems like the suggestion did not work, so reverting it back... :(
:(
what was the error? (we have a _has_len defined in the same file btw)
what about `num_batches = len(dataloader) if _has_len(dataloader) else float('inf')`?
This seems to be a long-term solution.
the problem is raising an exception in the __len__
yeah, my original suggestion was wrong, sorry. my new suggestion is to use the already existing `_has_len` method:
num_batches = len(dataloader) if _has_len(dataloader) else float('inf')
but not a big deal, it would just be cleaner.
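For context, a minimal sketch of what a helper like `_has_len` could look like; this is an assumption about the helper mentioned above, and the actual implementation in the repository may differ:

```python
def _has_len(dataloader) -> bool:
    """Assumed sketch: report whether a dataloader exposes a usable __len__.

    Loaders backed by an IterableDataset typically raise TypeError from len().
    """
    try:
        len(dataloader)
        return True
    except (TypeError, NotImplementedError):
        return False

# Usage at the call site, as suggested above:
# num_batches = len(dataloader) if _has_len(dataloader) else float('inf')
```

Centralizing the check in one helper keeps the exception handling in a single place, so the call sites stay readable.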
@@ -155,7 +154,7 @@ def test_accumulation_and_early_stopping(tmpdir):
         'Learning rate was not altered after running learning rate finder'
     assert len(lrfinder.results['lr']) == 100, \
         'Early stopping for learning rate finder did not work'
-    assert lrfinder._total_batch_idx == 100 * 2, \
+    assert lrfinder._total_batch_idx == 190, \
how is this related?
not really, but the test was skipped completely
Force-pushed from dbe540a to 9412f7e.
Looks good to me.
* Start accumulate gradients schedule at epoch 0
* Undo change in #2375
* Update test_trainer.py::test_gradient_accumulation_scheduling
* Fix pep8 formatting
* Remove 'Datasets/' folder
* Split args for readability
* Fix pep8 formatting
What does this PR do?
Somehow the master is still failing.
Fixes #2293 #2307
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃