
Use torch.nn.utils.clip_grad_norm_ and add clip_grad_by_value support for TPU #7025

Merged
merged 19 commits into master from use-torch-clip-grad-norm on May 7, 2021

Conversation

carmocca
Contributor

@carmocca carmocca commented Apr 14, 2021

What does this PR do?

See title

  • EPSILON is removed: PyTorch does not take it as an argument, and it should not be necessary since gradients are unscaled before clipping. I'm not sure why MixedPrecisionPlugin had EPSILON=1e-5 instead of 1e-6; maybe somebody has more info about this.
  • Improves the docs to clarify the link to mixed precision.
  • Adds support for clip_grad_value_ on TPU (see the sketch below).

Related question on torch-xla: pytorch/xla#2884
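
For reference, a minimal sketch of the two built-in PyTorch utilities this PR switches to. This is illustrative only, not the Lightning plugin code; the clip_gradients helper and its by_value flag are hypothetical names used here for demonstration.

import torch
from torch import nn

def clip_gradients(model: nn.Module, clip_val: float, by_value: bool = False) -> None:
    """Clip gradients in place, after they have been unscaled."""
    parameters = model.parameters()
    if by_value:
        # Element-wise clamp of every gradient entry into [-clip_val, +clip_val].
        torch.nn.utils.clip_grad_value_(parameters, clip_value=clip_val)
    else:
        # Rescale all gradients so their combined 2-norm is at most clip_val.
        # clip_grad_norm_ guards against a zero norm internally, which is why
        # a separate EPSILON argument should not be needed.
        torch.nn.utils.clip_grad_norm_(parameters, max_norm=clip_val, norm_type=2.0)

# Usage:
model = nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()
clip_gradients(model, clip_val=1.0)                  # clip by norm
clip_gradients(model, clip_val=0.5, by_value=True)   # clip by value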

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

@carmocca carmocca self-assigned this Apr 14, 2021
@codecov

codecov bot commented Apr 19, 2021

Codecov Report

Merging #7025 (a0a6ad9) into master (9ba76ce) will decrease coverage by 0%.
The diff coverage is 67%.

@@          Coverage Diff           @@
##           master   #7025   +/-   ##
======================================
- Coverage      92%     92%   -0%     
======================================
  Files         200     200           
  Lines       12992   12970   -22     
======================================
- Hits        11918   11891   -27     
- Misses       1074    1079    +5     

@carmocca carmocca changed the title [WIP] Use torch.nn.utils.clip_grad_norm_ Use torch.nn.utils.clip_grad_norm_ Apr 19, 2021
@carmocca carmocca added feature Is an improvement or enhancement accelerator: tpu Tensor Processing Unit labels Apr 19, 2021
@carmocca carmocca added this to the v1.4 milestone Apr 19, 2021
Contributor

@kaushikb11 kaushikb11 left a comment


LGTM!

docs/source/advanced/training_tricks.rst: review thread resolved (outdated)
tests/models/test_tpu.py: review thread resolved
@awaelchli
Member

There was a PR that did the exact opposite: #963

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
@carmocca
Contributor Author

There was a PR that did the exact opposite: #963

We apply the torch-xla patch for TPU, so there's no need to duplicate the code ourselves.

Props for the ancient PR reference, though.
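
A minimal sketch of value-based clipping on an XLA device, assuming the standard torch-xla API (torch_xla.core.xla_model); this is illustrative, not the Lightning plugin code. Because clip_grad_value_ is a purely element-wise clamp, it avoids the host synchronization that computing a global norm triggers, which is what the torch-xla patch to clip_grad_norm_ addresses.

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()                      # acquire the TPU/XLA device
model = torch.nn.Linear(4, 2).to(device)

loss = model(torch.randn(8, 4, device=device)).sum()
loss.backward()

# Clamp every gradient element into [-0.5, 0.5] in place; no .item() calls,
# so nothing forces an early device-to-host transfer.
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=0.5)

xm.mark_step()                                # flush the pending XLA graph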

@carmocca carmocca changed the title Use torch.nn.utils.clip_grad_norm_ Use torch.nn.utils.clip_grad_norm_ and add clip_grad_value support for TPU Apr 19, 2021
CHANGELOG.md: review thread resolved (outdated)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
@carmocca carmocca changed the title Use torch.nn.utils.clip_grad_norm_ and add clip_grad_value support for TPU Use torch.nn.utils.clip_grad_norm_ and add clip_grad_by_value support for TPU Apr 19, 2021
@mergify mergify bot removed the has conflicts label Apr 20, 2021
Contributor

@tchaton tchaton left a comment


LGTM !

@mergify mergify bot removed the has conflicts label May 4, 2021
@mergify mergify bot added the has conflicts label May 6, 2021
@mergify mergify bot removed the has conflicts label May 7, 2021
CHANGELOG.md: review thread resolved
CHANGELOG.md: review thread resolved (outdated)
CHANGELOG.md: review thread resolved
@carmocca carmocca enabled auto-merge (squash) May 7, 2021 15:57
@carmocca carmocca merged commit 8208c33 into master May 7, 2021
@carmocca carmocca deleted the use-torch-clip-grad-norm branch May 7, 2021 16:41
Labels
accelerator: tpu (Tensor Processing Unit), feature (Is an improvement or enhancement), refactor
Projects
None yet
Development
Successfully merging this pull request may close these issues: none yet.

5 participants