
Fix num_sanity_val_steps is clipped to limit_val_batches #2917

Merged
merged 11 commits into master on Aug 21, 2020

Conversation

@rohitgr7 (Contributor) commented Aug 11, 2020

What does this PR do?

Fixes #2882

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@rohitgr7 (Contributor, Author) commented Aug 11, 2020

Just want to confirm: should num_sanity_val_steps == -1 depend on limit_val_batches? I suggest it should, but I wanted to confirm before making changes. I will fix and add new tests accordingly if required :)

@Borda added the bug (Something isn't working) label Aug 11, 2020
@williamFalcon (Contributor) commented

num_sanity_val_steps should be an int or -1.

This is really meant to be a sanity check... floats will break this.

You need this min because if your dataset is smaller than limit_val_batches, it will crash.
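To illustrate the point about the min, here is a toy sketch (not Lightning code) of the generic out-of-range case it guards against:

```python
# Toy example: the val dataloader yields only 2 batches,
# but the user asked for 5 sanity steps.
val_batches = [{"x": 1}, {"x": 2}]
num_sanity_val_steps = 5

# Without clipping, stepping a fixed number of times walks past the data:
#   for i in range(num_sanity_val_steps):
#       batch = val_batches[i]   # IndexError once i reaches 2

# Clipping with min keeps the sanity check within the dataset:
for i in range(min(num_sanity_val_steps, len(val_batches))):
    print("sanity step", i, val_batches[i])
```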

@rohitgr7 (Contributor, Author) commented

I'm not talking about changing num_sanity_val_steps to a float or anything like that. Let me give an example:

```python
# val_dataloader_len = 20
Trainer(num_sanity_val_steps=-1, limit_val_batches=0.1)
```

On master this runs the sanity check for all 20 batches, but I suggest it should run for only 2: limit_val_batches should be taken into account when num_sanity_val_steps=-1 (see the sketch below).

Also, a float limit_val_batches doesn't work with num_sanity_val_steps != -1 on master.
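For illustration, a minimal standalone sketch (not Lightning source; the variable names are hypothetical) of the behavior proposed above, using the example values from this comment:

```python
# Assumed setup: a single val dataloader with 20 batches.
val_dataloader_len = 20
limit_val_batches = 0.1     # a float is treated as a fraction of the dataloader
num_sanity_val_steps = -1   # -1 means "run the full sanity check"

# limit_val_batches is applied first
num_val_batches = int(val_dataloader_len * limit_val_batches)  # -> 2

# proposed behavior: -1 expands to the already-limited batch count,
# and an explicit step count is clipped to what is actually available
if num_sanity_val_steps == -1:
    num_sanity_val_batches = num_val_batches                       # 2, not 20
else:
    num_sanity_val_batches = min(num_sanity_val_steps, num_val_batches)

print(num_sanity_val_batches)  # 2
```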

The inline review thread below refers to this hunk from the PR diff:

```python
using_val_step = ref_model.val_dataloader is not None and self.is_overridden('validation_step')
should_sanity_check = using_val_step and self.num_sanity_val_steps > 0 and self.limit_val_batches > 0

# run tiny validation (if validation defined)
# to make sure program won't crash during val
if should_sanity_check:
    self.reset_val_dataloader(ref_model)
    self._num_sanity_val_steps = [min(self.num_sanity_val_steps, val_batches)
                                  for val_batches in self.num_val_batches]
```
@awaelchli (Member) commented Aug 14, 2020

Since we already have fields for num_training_batches etc., should we call this simply num_sanity_val_batches for consistency?
I guess the reason you made it protected is that you want to differentiate between the user input arg and the internal list.

@rohitgr7 (Contributor, Author) replied

Yeah, that is one reason. The other is when using lr_find or scale_batch_size, or any other case where trainer.fit is called more than once. In such cases I don't think changing the init parameters themselves is a good idea (tests will fail too), which is why I used _num_sanity_val_steps.

Actually, there are two ways to handle this:

  1. Create _num_sanity_val_steps once and reuse it while initializing progressbar.total in the other PR (a minimal sketch of this option follows below).
  2. Or compute num_batches = [min(self.num_sanity_val_steps, val_batches) for val_batches in self.num_val_batches] again while calculating progressbar.total and avoid _num_sanity_val_steps.
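For illustration, a minimal standalone sketch of option 1 (the function name and signature are hypothetical; in the trainer the result would be stored once and reused, e.g. for the progress bar total):

```python
from typing import List

def compute_sanity_val_batches(num_sanity_val_steps: int,
                               num_val_batches: List[int]) -> List[int]:
    """Per-dataloader sanity-check batch counts, computed once and reused."""
    if num_sanity_val_steps == -1:
        # -1: run the sanity check over every (already limited) val batch
        return list(num_val_batches)
    # otherwise clip the requested steps to what each dataloader provides
    return [min(num_sanity_val_steps, n) for n in num_val_batches]

# example: two val dataloaders after limit_val_batches is applied
print(compute_sanity_val_batches(-1, [2, 5]))  # [2, 5]
print(compute_sanity_val_batches(3, [2, 5]))   # [2, 3]
# a progress bar could then reuse the stored list instead of recomputing it
```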

@awaelchli (Member) replied

I prefer 1., since it is consistent with the other totals we compute internally. If possible, I would also prefer that we name this internal variable num_sanity_val_batches, also for consistency with the other totals. No strong preference though; the important part is that the implementation is clean and fulfills our needs :)

@awaelchli (Member) replied

Ping me if you need more help with this PR.

@rohitgr7 (Contributor, Author) replied Aug 16, 2020

@awaelchli yeah, I was thinking of changing it to num_sanity_val_batches. I just need clarification on #2917 (comment).

@pep8speaks commented Aug 19, 2020

Hello @rohitgr7! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-08-20 23:06:17 UTC

@awaelchli (Member) commented

@rohitgr7 I fixed one test. Let's try to finish this and unblock the other PR. If @williamFalcon disagrees we can always change it. Anyway, the important part, I guess, is that we can keep the list internally and thereby easily support the counts for multiple dataloaders and access them elsewhere.

@codecov (bot) commented Aug 19, 2020

Codecov Report

Merging #2917 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2917   +/-   ##
======================================
  Coverage      90%     90%           
======================================
  Files          81      81           
  Lines        7734    7734           
======================================
  Hits         6980    6980           
  Misses        754     754           

@rohitgr7 changed the title from "[WIP] Fix num_sanity_val_steps according to limit_val_batches" to "Fix num_sanity_val_steps is clipped to limit_val_batches" Aug 19, 2020
@awaelchli (Member) left a comment

👍

@mergify (bot) commented Aug 19, 2020

This pull request is now in conflict... :(

@SkafteNicki (Member) left a comment

LGTM

@Borda added this to the 0.9.x milestone Aug 20, 2020

@rohitgr7 (Contributor, Author) commented

@Borda can we merge this?

@Borda (Member) commented Aug 21, 2020

> @Borda can we merge this?

I have not checked it, but it seems you have 3 approvals, so yes...

Labels
bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Int num_sanity_val_steps is always replaced by float limit_val_batches
7 participants