Fix warmup accumulate #3722

Merged: 6 commits, Jun 28, 2021

Commits on Jun 28, 2021

  1. gradient accumulation during warmup in train.py

    Context:
    `accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
    During warmup, it is ramped up from 1 to the final value nbs / batch_size. 
    Although I have not seen this in other libraries, I like the idea. During warmup, gradients are large, so oversized steps are more of an issue than the gradient noise that comes with small steps.
    
    The bug:
    The condition that triggers the optimizer step is wrong:
    > if ni % accumulate == 0:
    This produces irregular step sizes (the number of batches accumulated per optimizer step) whenever `accumulate` is not constant. It becomes relevant when batch_size is small and `accumulate` changes many times during warmup.
    
    This demo also shows the proposed solution, to use a ">=" condition instead:
    https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing
    
    Further, I propose removing the restriction that forces the number of warmup iterations to be >= 1000. If the user changes hyp['warmup_epochs'], this floor causes unexpected behavior. It would also make hyperparameter evolution unstable if this parameter were to be optimized.
    yellowdolphin authored and glenn-jocher committed Jun 28, 2021
    Commit a93e484
  2. Commit ce33f84
  3. add docstrings

    yellowdolphin authored and glenn-jocher committed Jun 28, 2021
    Commit b7c489f
  4. move down nw

    yellowdolphin authored and glenn-jocher committed Jun 28, 2021
    Commit dd5a574
  5. Update train.py

    glenn-jocher committed Jun 28, 2021
    Commit 53aae7f
  6. revert math import move

    glenn-jocher committed Jun 28, 2021
    Commit 9ac476e
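
The stepping behavior described in the first commit message can be reproduced without any training code. The sketch below is a minimal, stdlib-only simulation and not the code from train.py: the helper names `accumulate_at` and `step_gaps`, the linear ramp formula, and the values nw=100 and target=16 (for example nbs=64 with batch_size=4) are assumptions chosen for illustration. It records how many batches pass between consecutive optimizer steps under the original modulo condition and under a ">=" condition that remembers the last step.

```python
def accumulate_at(ni, nw, target):
    """Hypothetical ramp: accumulate grows linearly from 1 to target over nw iterations."""
    if ni >= nw:
        return target
    return max(1, round(1 + (target - 1) * ni / nw))


def step_gaps(n_iters, nw, target, use_fix):
    """Return the number of batches accumulated before each optimizer step."""
    gaps = []
    last_step = -1  # iteration index of the previous optimizer step
    for ni in range(n_iters):
        acc = accumulate_at(ni, nw, target)
        if use_fix:
            # Proposed rule: step once at least `acc` batches have been accumulated.
            do_step = ni - last_step >= acc
        else:
            # Original rule: step whenever the iteration index is a multiple of `acc`.
            do_step = ni % acc == 0
        if do_step:
            gaps.append(ni - last_step)
            last_step = ni
    return gaps


# Assumed illustration values, not taken from the PR.
NW, TARGET, N_ITERS = 100, 16, 200

buggy = step_gaps(N_ITERS, NW, TARGET, use_fix=False)
fixed = step_gaps(N_ITERS, NW, TARGET, use_fix=True)

print("modulo rule gaps:", buggy)
print(">= rule gaps:    ", fixed)
```

With these assumed values, the ">=" rule always steps after exactly the current `accumulate` batches, so its gaps grow monotonically and never exceed the target. The modulo rule misses its trigger points while `accumulate` is changing and finishes warmup with a gap larger than the target.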