
Fix global_step when gradient accumulation > 1 #832

Merged 1 commit on Feb 16, 2020
Conversation

@peteriz commented Feb 13, 2020

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
  • Did you read the contributor guideline?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes the global_step update when gradient accumulation is > 1: global_step is now incremented only after all accumulated parts of the batch are done.
Fixes #831
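
For illustration, here is a minimal plain-PyTorch sketch of the behaviour this fix restores (this is not the Lightning Trainer code; the model, optimizer, and batch count are made up): the optimizer steps once every accumulate_grad_batches batches, and global_step should advance only at that point.

import torch

# Illustrative only: a tiny training loop with gradient accumulation.
accumulate_grad_batches = 4
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
global_step = 0

for batch_idx in range(8):                       # 8 batches -> 2 optimizer steps
    x = torch.randn(16, 10)
    loss = model(x).pow(2).mean() / accumulate_grad_batches
    loss.backward()                              # gradients accumulate across batches

    if (batch_idx + 1) % accumulate_grad_batches == 0:
        optimizer.step()
        optimizer.zero_grad()
        global_step += 1                         # one global step per optimizer step

print(global_step)  # 2; before this fix the trainer advanced global_step every batch (see #831)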

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@peteriz requested a review from Borda on February 13, 2020 11:02
@@ -426,7 +426,9 @@ def run_training_epoch(self):
             # logs user requested information to logger
             self.log_metrics(batch_step_metrics, grad_norm_dic)

-            self.global_step += 1
+            # progress global step according to grads progress
+            if (self.batch_idx + 1) % self.accumulate_grad_batches == 0:
+                self.global_step += 1
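
As a quick illustration of the new condition (the values below are hypothetical), global_step now advances only on the last part of each accumulated batch:

# Hypothetical values to show when the condition fires:
accumulate_grad_batches = 3
for batch_idx in range(6):
    print(batch_idx, (batch_idx + 1) % accumulate_grad_batches == 0)
# 0 False, 1 False, 2 True, 3 False, 4 False, 5 True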
@Borda (Member) commented Feb 14, 2020

Can we clarify what the meaning of these two variables is?
Could you please add a comment to the trainer init starting with #: so it is also generated in the documentation?
Thanks

Successfully merging this pull request may close these issues.

global_step advanced between accumulations if gradient_accumulation > 1
3 participants