Use torch.nn.utils.clip_grad_norm_ and add clip_grad_by_value support for TPU #7025

Conversation
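For context on what this change enables at the user level, here is a hedged sketch of the Trainer-facing API; the gradient_clip_algorithm and tpu_cores arguments are assumptions based on the Lightning 1.3-era API this PR targets, not something stated in this thread:

```python
from pytorch_lightning import Trainer

# Sketch (assumed ~1.3-era API): clip each gradient element into
# [-0.5, 0.5] instead of clipping by total norm, now also on TPU.
trainer = Trainer(
    gradient_clip_val=0.5,
    gradient_clip_algorithm="value",  # assumption: "norm" is the default
    tpu_cores=8,                      # assumption: TPU flag of that era
)
```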
Codecov Report

@@            Coverage Diff            @@
##            master    #7025    +/-  ##
=========================================
- Coverage       92%      92%     -0%
=========================================
  Files          200      200
  Lines        12992    12970     -22
=========================================
- Hits         11918    11891     -27
- Misses        1074     1079      +5
torch.nn.utils.clip_grad_norm_
LGTM!
There was a PR that did the exact opposite: #963
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
We apply the torch-xla patch for TPU, so there is no need to duplicate the code ourselves. Props for the ancient PR reference, though!
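A minimal sketch of what relying on that patch looks like from the caller's side, assuming (as this comment implies) that importing torch-xla replaces torch.nn.utils.clip_grad_norm_ with an XLA-aware version; the toy model is illustrative only:

```python
import torch
import torch.nn as nn
import torch_xla  # assumption: import applies torch-xla's patches,
                  # swapping in an XLA-aware clip_grad_norm_

model = nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()

# One call site for CPU, GPU, and TPU; on TPU the patched function
# runs, so no device-specific clipping code is needed in Lightning.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```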
Use torch.nn.utils.clip_grad_norm_ and add clip_grad_value support for TPU
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
LGTM!
What does this PR do?

See title.

- EPSILON is removed. PyTorch does not have it as an argument, and it should not be necessary since gradients are unscaled before clipping. I'm not sure why MixedPrecisionPlugin had EPSILON=1e-5 instead of 1e-6; maybe somebody has more info about this. The sketch after this list shows where this epsilon enters PyTorch's clip coefficient.
- Adds clip_grad_value_ on TPU. Related question on torch-xla: pytorch/xla#2884
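For reference, a minimal sketch of the two built-in PyTorch utilities this PR switches to, with a toy model chosen only for illustration; the 1e-6 epsilon discussed above is the one hard-coded inside clip_grad_norm_:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()

# Clip by total norm. Internally PyTorch computes
#   clip_coef = max_norm / (total_norm + 1e-6)
# which is where the epsilon lives; no EPSILON argument is exposed.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

# Clip each gradient element into [-clip_value, clip_value].
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=0.5)
```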
Before submitting
PR review