clip_gradient with clip_grad_value #5460
Moved the following comment from #5456 since reopening the issue was disabled :-)

Hi @tchaton,

There are two popular gradient clipping methods: one limits the maximum value of each model parameter's gradient, and the other scales the gradients based on the p-norm of a (sub-)set of model parameters. PyTorch Lightning implements the second option, which can be used via the Trainer's gradient_clip_val parameter, as you mentioned. By the way, I don't think this functionality is something that can break backward compatibility.
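The norm-based method described in the comment above can be sketched in plain Python. This is a minimal illustration of the behaviour of `torch.nn.utils.clip_grad_norm_` on bare floats, not Lightning's actual implementation:

```python
import math

def scale_grads_by_pnorm(grads, max_norm, p=2.0):
    # Sketch of norm-based clipping: compute the p-norm over all
    # gradients, then rescale every gradient by max_norm / total_norm
    # whenever the total norm exceeds max_norm. Gradients below the
    # threshold are returned unchanged.
    total_norm = sum(abs(g) ** p for g in grads) ** (1.0 / p)
    if total_norm > max_norm:
        factor = max_norm / total_norm
        return [g * factor for g in grads]
    return list(grads)

# A gradient vector with L2 norm 5.0 is rescaled so its norm becomes 1.0;
# note that *every* entry is scaled, not just the large one.
clipped = scale_grads_by_pnorm([3.0, -4.0], max_norm=1.0)  # ≈ [0.6, -0.8]
```

The key property is that the scaling factor is shared: one large gradient shrinks all the others along with it.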
Hey @priancho @dhkim0225,

Yes, I understand now. Sounds like a great idea! Would @priancho or @dhkim0225 like to make a PR for this feature?

Best,
@tchaton Sorry for bothering you. This is my first time contributing to a large open-source project.
🚀 Feature
Same issue as #4927 and #5456.
The current clip_gradient uses clip_grad_norm; can we add clip_grad_value?
https://github.com/PyTorchLightning/pytorch-lightning/blob/f2e99d617f05ec65fded81ccc6d0d59807c47573/pytorch_lightning/plugins/native_amp.py#L63-L65
============================================================
@tchaton
As far as I know, there is a difference between clip_grad_by_value and clip_grad_by_norm.
All of the implementations in PL only use clip_grad_by_norm. clip_grad_by_value does not perform clipping based on a norm value but simply clamps each gradient to a fixed range, so it is useful when training a model on noisy data. Please let me know if you think I'm wrong.
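For contrast, the per-element behaviour of `torch.nn.utils.clip_grad_value_` can be sketched as follows (a plain-Python illustration on bare floats, not the PyTorch implementation):

```python
def clip_grads_by_value(grads, clip_value):
    # Sketch of per-element value clipping: each gradient is clamped
    # independently to [-clip_value, clip_value]. No norm is computed,
    # so a single outlier gradient (e.g. from a noisy sample) cannot
    # shrink the other, well-behaved gradients.
    return [max(-clip_value, min(clip_value, g)) for g in grads]

grads = [0.1, 100.0]  # one noisy outlier
clipped = clip_grads_by_value(grads, clip_value=1.0)
# the small gradient is untouched; only the outlier is clamped
```

This independence between entries is exactly why value clipping can be preferable with noisy data, whereas norm clipping rescales every gradient whenever the overall norm is large.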
pytorch clip by norm link
pytorch clip by value link
Sincerely,
Anthony Kim.