Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise LR schedule warnings only when necessary #3207

Merged
merged 4 commits into from
Apr 24, 2024

Conversation

snarayan21
Copy link
Contributor

@snarayan21 snarayan21 commented Apr 23, 2024

What does this PR do?

Simple bug fix -- we were raising warnings for LR schedulers when not intended because we were not exiting out of the function in valid cases. Fixes #3205

What issue(s) does this change relate to?

Before submitting

  • Have you read the contributor guidelines?
  • Is this change a documentation change or typo fix? If so, skip the rest of this checklist.
  • Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so.
  • Did you update any related docs and document your change?
  • Did you update any related tests and add any new tests related to your change? (see testing)
  • Did you run the tests locally to make sure they pass?
  • Did you run pre-commit on your change? (see the pre-commit section of prerequisites)

@snarayan21
Copy link
Contributor Author

whoops seems some other debugging stuff snuck in here. removing

Copy link
Contributor

@b-chu b-chu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@snarayan21 snarayan21 merged commit 615979e into dev Apr 24, 2024
15 checks passed
@snarayan21 snarayan21 deleted the saaketh/lr_scheduling_warnng branch April 24, 2024 01:48
Comment on lines +566 to +572
if t_max.unit != max_dur.unit:
# Units are not comparable, so we cannot check if t_max is valid. Log this and return.
log.debug(
f'Since max_duration {max_dur} with units {max_dur.unit} and t_max {t_max} with units {t_max.unit} are not '
'comparable, make sure that your LR schedule is defined at all points in the training duration.',
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this log statement can be very repetitive as it emanates from the __call__. Should this be part of the __init__ instead ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Antoine -- the problem described in the issue is less the warning level and more the frequency of logging

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the log was still an error because the durations were equal through the dataloader length even the units were different. The current refactoring solve this with the early returns.

Now that I give it a second though, I'm not sure my proposal of handling it in the __init__ makes sense because the max_duration can change if the .fit() if called multiple times.

j316chuck pushed a commit that referenced this pull request May 16, 2024
* compare and exit

* compare and exit

* revert changes

* edge case ://
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Composer default lr scheduler create spurious warnings when max training duration is in epochs
4 participants