Allow setting per-layer learning rates #1612

Merged
shaydeci merged 20 commits into master from feature/SG-1209_introduce_optimizer_initializer on Nov 13, 2023

Conversation

shaydeci
Collaborator

@shaydeci commented Nov 6, 2023

  • Added support for passing initial_lr as a dictionary.
  • Removed the usages of update_param_groups and initialize_param_groups. Affected recipes had an equivalent initial_lr mapping added to them (tested).
  • For the edge case of an already-instantiated optimizer, names are assigned to the parameter groups and an initial_lr mapping is extracted from them so our schedulers can still be used (a plain-PyTorch sketch of this follows below).
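
To make the last point concrete, here is a minimal, self-contained sketch in plain PyTorch (not the library code from this PR; the backbone/neck/head split, the chosen learning rates, and the extra "name" key on each parameter group are illustrative assumptions):

```python
import torch
from torch import nn

# Illustrative toy model split into named parts so each can get its own LR.
model = nn.ModuleDict({
    "backbone": nn.Linear(8, 16),
    "neck": nn.Linear(16, 16),
    "head": nn.Linear(16, 4),
})

base_lr = 0.1
# Named parameter groups: PyTorch keeps extra keys such as "name" on the group dict.
optimizer = torch.optim.SGD([
    {"params": model["backbone"].parameters(), "lr": 0.0,           "name": "backbone"},
    {"params": model["neck"].parameters(),     "lr": 0.1 * base_lr, "name": "neck"},
    {"params": model["head"].parameters(),     "lr": base_lr,       "name": "head"},
])

# An initial_lr mapping can then be recovered from the named groups, which is the
# shape a scheduler needs in order to scale each group relative to its own initial LR.
initial_lr = {g["name"]: g["lr"] for g in optimizer.param_groups}
print(initial_lr)  # {'backbone': 0.0, 'neck': 0.01, 'head': 0.1}
```

With such a mapping in hand, a scheduler can update every group independently instead of assuming one global initial learning rate.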

BloodAxe
Collaborator

@BloodAxe left a comment

I went pretty brutal on this PR, sorry :)
I mean, it's good. It's written well, it introduces new features while keeping support for the existing logic, and it's clean. Just a few tricky things here and there.

@BloodAxe
Collaborator

BloodAxe commented Nov 8, 2023

And a cherry on top - let's rename the PR title to be "GitHub release notes"-friendly. Something like: "Allow setting per-layer learning rates"

@shaydeci changed the title from "Feature/sg 1209 introduce initial_lr as mapping" to "Allow setting per-layer learning rates" on Nov 9, 2023
@BloodAxe
Collaborator

BloodAxe commented Nov 9, 2023

One last remark from me - please add an integration test for any model of your choice where we train the model for one short epoch with backbone lr=0, neck=0.1xLR, head=LR to ensure it all works end-to-end.
And then - LGTM :)

@shaydeci
Collaborator Author

One last remark from me - please add an integration test for any model of your choice where we train the model for one short epoch with backbone lr=0, neck=0.1xLR, head=LR to ensure it all works end-to-end.
And then - LGTM :)

Added a unit test that trains with part of the net frozen.
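
Not the actual test that was added, but a rough, self-contained illustration of the kind of check such a test might perform, using the hypothetical backbone/neck/head LRs from the review comment above and toy data:

```python
import copy
import torch
from torch import nn

# Sketch only: train for one short "epoch" with the backbone LR set to 0,
# then assert the frozen part did not move while the full-LR head did.
model = nn.ModuleDict({
    "backbone": nn.Linear(8, 16),
    "neck": nn.Linear(16, 16),
    "head": nn.Linear(16, 4),
})
base_lr = 0.1
optimizer = torch.optim.SGD([
    {"params": model["backbone"].parameters(), "lr": 0.0},
    {"params": model["neck"].parameters(),     "lr": 0.1 * base_lr},
    {"params": model["head"].parameters(),     "lr": base_lr},
])

backbone_before = copy.deepcopy(model["backbone"].state_dict())
head_before = copy.deepcopy(model["head"].state_dict())

x, y = torch.randn(32, 8), torch.randint(0, 4, (32,))
for _ in range(5):  # stands in for "one short epoch"
    optimizer.zero_grad()
    logits = model["head"](model["neck"](model["backbone"](x)))
    nn.functional.cross_entropy(logits, y).backward()
    optimizer.step()

# The zero-LR group must be bitwise unchanged; the full-LR head must have updated.
for k, v in model["backbone"].state_dict().items():
    assert torch.equal(v, backbone_before[k])
assert any(
    not torch.equal(v, head_before[k]) for k, v in model["head"].state_dict().items()
)
```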

@shaydeci merged commit f8686cd into master on Nov 13, 2023
6 of 7 checks passed
@shaydeci deleted the feature/SG-1209_introduce_optimizer_initializer branch on November 13, 2023 at 08:14