About torch.nn.utils clipping functions #2884
Comments
Thanks! We meant to test this out. I agree that
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
It would be nice to revisit this. In Lightning, we are seeing this error:

```
E  RuntimeError: The norm of order 2.0 for a gradient from `parameters` is non-finite, so it cannot be clipped. This error can be disabled with `error_if_nonfinite=False`
```

for tests that run and pass normally if the XLA patch is not applied. As a workaround, I'm doing:

```python
if hasattr(torch.nn.utils.clip_grad_norm_, "_orig"):
    # hacky workaround to https://github.com/pytorch/xla/issues/2884: undo xla patching on import
    torch.nn.utils.clip_grad_norm_ = torch.nn.utils.clip_grad_norm_._orig
```

(in Lightning-AI/pytorch-lightning#17519). This global patching is particularly problematic because it is done regardless of whether you actually end up using XLA at all.
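For reference, the error message above mentions the `error_if_nonfinite` flag on `clip_grad_norm_`. A minimal sketch of silencing the check (assuming a torch version recent enough to have the flag; the non-finite gradient here is forced artificially just to reproduce the error path):

```python
import torch

model = torch.nn.Linear(4, 2)
model(torch.randn(8, 4)).sum().backward()

# Force a non-finite gradient to reproduce the check.
for p in model.parameters():
    p.grad.fill_(float("inf"))

# With error_if_nonfinite=False, a non-finite total norm no longer raises
# RuntimeError; clip_grad_norm_ returns the (infinite) total norm instead.
total_norm = torch.nn.utils.clip_grad_norm_(
    model.parameters(), max_norm=1.0, error_if_nonfinite=False
)
print(total_norm)  # tensor(inf)
```

Note this only suppresses the error; the clipped gradients are still not usable when the norm is non-finite, so undoing the patch (as above) remains the actual fix for tests that are finite without it.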
❓ Questions and Help
Hello,

I noticed xla patches the `torch.nn.utils.clip_grad_norm_` function, allegedly due to performance issues. From https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md:

> PyTorch's implementation of `clip_grad_norm_` was updated in pytorch/pytorch#32020 (which made it into 1.5.0) so the computation no longer relies on `.item()`

Does that mean xla's patch is no longer necessary on torch >= 1.5.0?
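To illustrate what that change bought: the clip coefficient can be kept as a tensor end to end, so a lazy backend like XLA never has to synchronize the norm back to the host. A hypothetical sketch of the pattern (`clip_grad_norm_sketch` is not a real torch API, and this is not the exact upstream implementation):

```python
import torch

def clip_grad_norm_sketch(parameters, max_norm, eps=1e-6):
    # Everything below stays on-device: no .item() call ever
    # materializes the norm on the host.
    grads = [p.grad for p in parameters if p.grad is not None]
    total_norm = torch.linalg.vector_norm(
        torch.stack([torch.linalg.vector_norm(g) for g in grads])
    )
    # torch.clamp replaces a Python-level `if clip_coef < 1:` branch,
    # which would implicitly synchronize via a scalar comparison.
    clip_coef = torch.clamp(max_norm / (total_norm + eps), max=1.0)
    for g in grads:
        g.mul_(clip_coef)  # in-place scale of each gradient
    return total_norm
```

The unconditional `mul_` by a clamped coefficient is the key difference from the pre-1.5 pattern that branched on the scalar norm.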
Additionally, are there any known issues for `torch.nn.utils.clip_grad_value_`? I am assuming that's not the case since there are no `.item()` calls, but I could not find any confirmation anywhere.

Thanks!