
Use torch.nn.utils.clip_grad_norm_ and add clip_grad_by_value support for TPU #7025

Merged
merged 19 commits into master from use-torch-clip-grad-norm on May 7, 2021

Conversation

carmocca
Contributor

@carmocca carmocca commented Apr 14, 2021

What does this PR do?

See title

  • EPSILON is removed: PyTorch does not take it as an argument, and it should not be necessary since gradients are unscaled before clipping. I'm not sure why MixedPrecisionPlugin had EPSILON=1e-5 instead of 1e-6; maybe somebody has more info about this.
  • Improves the docs to clarify the link to mixed precision.
  • Adds support for clip_grad_value_ on TPU (see the sketch below).

Related question on torch-xla: pytorch/xla#2884
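
For reference, a minimal sketch of the two built-in PyTorch utilities this PR switches to. This is illustrative only, not the Lightning plugin code; the clip_gradients helper and its by_value flag are hypothetical names used here for demonstration.

import torch
from torch import nn

def clip_gradients(model: nn.Module, clip_val: float, by_value: bool = False) -> None:
    """Clip gradients in place, after they have been unscaled."""
    parameters = model.parameters()
    if by_value:
        # Element-wise clamp of every gradient entry into [-clip_val, +clip_val].
        torch.nn.utils.clip_grad_value_(parameters, clip_value=clip_val)
    else:
        # Rescale all gradients so their combined 2-norm is at most clip_val.
        # clip_grad_norm_ guards against a zero norm internally, which is why
        # a separate EPSILON argument should not be needed.
        torch.nn.utils.clip_grad_norm_(parameters, max_norm=clip_val, norm_type=2.0)

# Usage:
model = nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()
clip_gradients(model, clip_val=1.0)                  # clip by norm
clip_gradients(model, clip_val=0.5, by_value=True)   # clip by value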

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

@carmocca carmocca self-assigned this Apr 14, 2021
@codecov

codecov bot commented Apr 19, 2021

Codecov Report

Merging #7025 (a0a6ad9) into master (9ba76ce) will decrease coverage by 0%.
The diff coverage is 67%.

@@          Coverage Diff           @@
##           master   #7025   +/-   ##
======================================
- Coverage      92%     92%   -0%     
======================================
  Files         200     200           
  Lines       12992   12970   -22     
======================================
- Hits        11918   11891   -27     
- Misses       1074    1079    +5     

@carmocca carmocca changed the title [WIP] Use torch.nn.utils.clip_grad_norm_ Use torch.nn.utils.clip_grad_norm_ Apr 19, 2021
@carmocca carmocca added feature Is an improvement or enhancement accelerator: tpu Tensor Processing Unit labels Apr 19, 2021
@carmocca carmocca added this to the v1.4 milestone Apr 19, 2021
Contributor

@kaushikb11 kaushikb11 left a comment


LGTM!

docs/source/advanced/training_tricks.rst: review thread resolved (outdated)
tests/models/test_tpu.py: review thread resolved
@awaelchli
Member

There was a PR that did the exact opposite: #963

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
@carmocca
Contributor Author

There was a PR that did the exact opposite: #963

We apply the torch-xla patch for TPU, so there's no need to duplicate the code ourselves.

Props for the ancient PR reference, though.
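
A minimal sketch of value-based clipping on an XLA device, assuming the standard torch-xla API (torch_xla.core.xla_model); this is illustrative, not the Lightning plugin code. Because clip_grad_value_ is a purely element-wise clamp, it avoids the host synchronization that computing a global norm triggers, which is what the torch-xla patch to clip_grad_norm_ addresses.

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()                      # acquire the TPU/XLA device
model = torch.nn.Linear(4, 2).to(device)

loss = model(torch.randn(8, 4, device=device)).sum()
loss.backward()

# Clamp every gradient element into [-0.5, 0.5] in place; no .item() calls,
# so nothing forces an early device-to-host transfer.
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=0.5)

xm.mark_step()                                # flush the pending XLA graph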

@carmocca carmocca changed the title Use torch.nn.utils.clip_grad_norm_ Use torch.nn.utils.clip_grad_norm_ and add clip_grad_value support for TPU Apr 19, 2021
CHANGELOG.md: review thread resolved (outdated)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
@carmocca carmocca changed the title Use torch.nn.utils.clip_grad_norm_ and add clip_grad_value support for TPU Use torch.nn.utils.clip_grad_norm_ and add clip_grad_by_value support for TPU Apr 19, 2021
@mergify mergify bot removed the has conflicts label Apr 20, 2021
Contributor

@tchaton tchaton left a comment


LGTM !

@mergify mergify bot removed the has conflicts label May 4, 2021
@mergify mergify bot added the has conflicts label May 6, 2021
@mergify mergify bot removed the has conflicts label May 7, 2021
CHANGELOG.md: review thread resolved
CHANGELOG.md: review thread resolved (outdated)
CHANGELOG.md: review thread resolved
@carmocca carmocca enabled auto-merge (squash) May 7, 2021 15:57
@carmocca carmocca merged commit 8208c33 into master May 7, 2021
@carmocca carmocca deleted the use-torch-clip-grad-norm branch May 7, 2021 16:41
Labels
accelerator: tpu (Tensor Processing Unit), feature (Is an improvement or enhancement), refactor
Projects
None yet
Development
Successfully merging this pull request may close these issues: none yet.

5 participants