Trainer with grad accum #6930

sgugger · 2020-09-03T20:54:23Z

As mentioned on the forum, the behavior of Trainer can be confusing when using gradient accumulation: the step count then refers to update steps, not to the number of training examples seen. This PR adds a warning to the docs.
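To illustrate the counting, here is a minimal sketch in plain Python. It is not the actual Trainer code: `simulate` and its return values are hypothetical, and the parameter names only mirror the `gradient_accumulation_steps` and `logging_steps` training arguments. It assumes one optimizer update every `gradient_accumulation_steps` batches and that logging is driven by the update-step counter:

```python
from typing import List, Tuple


def simulate(
    num_batches: int,
    gradient_accumulation_steps: int,
    logging_steps: int,
) -> Tuple[int, List[int]]:
    """Count update steps and record the batch index at which each log fires.

    Hypothetical helper for illustration only; assumes the step counter
    advances once per optimizer update, not once per batch.
    """
    global_step = 0  # counts optimizer updates
    logged_at_batch = []
    for batch_idx in range(1, num_batches + 1):
        # A forward/backward pass would run here for every batch.
        if batch_idx % gradient_accumulation_steps == 0:
            global_step += 1  # gradients applied: one update step
            if global_step % logging_steps == 0:
                logged_at_batch.append(batch_idx)
    return global_step, logged_at_batch


steps, logged = simulate(
    num_batches=24, gradient_accumulation_steps=4, logging_steps=2
)
print(steps, logged)
```

With 24 batches, 4 accumulation steps, and logging every 2 steps, this gives 6 update steps, with logging after batches 8, 16, and 24, that is, every `gradient_accumulation_steps * logging_steps` batches.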

codecov · 2020-09-03T21:00:52Z

Codecov Report

Merging #6930 into master will decrease coverage by 3.53%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #6930      +/-   ##
==========================================
- Coverage   80.60%   77.07%   -3.54%     
==========================================
  Files         161      161              
  Lines       30119    30119              
==========================================
- Hits        24278    23214    -1064     
- Misses       5841     6905    +1064

Impacted Files	Coverage Δ
src/transformers/training_args.py	`91.66% <ø> (ø)`
src/transformers/training_args_tf.py	`47.45% <ø> (ø)`
src/transformers/configuration_lxmert.py	`20.00% <0.00%> (-80.00%)`	⬇️
src/transformers/modeling_tf_lxmert.py	`22.49% <0.00%> (-71.63%)`	⬇️
src/transformers/modeling_tf_albert.py	`21.47% <0.00%> (-69.44%)`	⬇️
src/transformers/modeling_lxmert.py	`23.50% <0.00%> (-67.27%)`	⬇️
src/transformers/tokenization_albert.py	`28.84% <0.00%> (-58.66%)`	⬇️
src/transformers/modeling_xlnet.py	`60.81% <0.00%> (-22.62%)`	⬇️
src/transformers/tokenization_transfo_xl.py	`20.53% <0.00%> (-21.21%)`	⬇️
src/transformers/modeling_transfo_xl_utilities.py	`52.98% <0.00%> (-13.44%)`	⬇️
... and 15 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 207ed8c...68c12f3. Read the comment docs.

LysandreJik

LGTM, cool warning!

sgugger added 2 commits September 3, 2020 16:49

Add warning for gradient accumulation

fd8d5d6

Formatting

68c12f3

sgugger requested review from julien-c and LysandreJik September 3, 2020 20:54

LysandreJik approved these changes Sep 7, 2020

LysandreJik merged commit 08de989 into master Sep 7, 2020

LysandreJik deleted the trainer_with_grad_accum branch September 7, 2020 08:54

Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020

Trainer with grad accum (huggingface#6930)

077442b

* Add warning for gradient accumulation
* Formatting

fabiocapsouza pushed a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020

Trainer with grad accum (huggingface#6930)

37b0947

* Add warning for gradient accumulation
* Formatting

fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020

Revert "Trainer with grad accum (huggingface#6930)"

64d1c3d

This reverts commit 37b0947.

This pull request was closed.
