Fix warmup `accumulate` #3722

Context: `accumulate` is the number of batches (gradients) accumulated before the next call to `optimizer.step()`. During warmup, it is ramped up from 1 to its final value, `nbs / batch_size`. Although I have not seen this in other libraries, I like the idea: during warmup gradients are large, so overly large steps are more of an issue than gradient noise due to small steps.

The bug: the condition for performing the optimizer step is wrong:

> if ni % accumulate == 0:

This produces irregular step sizes if `accumulate` is not constant. It becomes relevant when `batch_size` is small and `accumulate` changes many times during warmup. This demo also shows the proposed solution, which is to use a ">=" condition instead: https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing

Further, I propose not restricting the number of warmup iterations to >= 1000. If the user changes `hyp['warmup_epochs']`, this restriction causes unexpected behavior. It also makes evolution unstable if this parameter were to be optimized.
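To illustrate the difference between the two conditions, here is a minimal, self-contained sketch. It is not the code from this PR or from the linked notebook: the helper names (`accumulate_at`, `step_gaps_modulo`, `step_gaps_ge`), the linear ramp with rounding, and the `last_opt_step` counter are assumptions made for illustration only. Each function returns the number of batches accumulated between consecutive optimizer steps.

```python
def accumulate_at(ni, nw, final):
    """Hypothetical schedule: ramp `accumulate` linearly from 1 to `final`
    over `nw` warmup iterations, rounded to an integer (assumption)."""
    if ni >= nw:
        return final
    return max(1, round(1 + (final - 1) * ni / nw))


def step_gaps_modulo(n_iters, nw, final):
    """Gaps between optimizer steps with the original `ni % accumulate == 0` test."""
    steps = [ni for ni in range(n_iters) if ni % accumulate_at(ni, nw, final) == 0]
    return [b - a for a, b in zip(steps, steps[1:])]


def step_gaps_ge(n_iters, nw, final):
    """Gaps between optimizer steps with a '>=' test against the last step index."""
    gaps = []
    last_opt_step = -1  # hypothetical counter: index of the last stepped batch
    for ni in range(n_iters):
        if ni - last_opt_step >= accumulate_at(ni, nw, final):
            gaps.append(ni - last_opt_step)  # batches accumulated for this step
            last_opt_step = ni
    return gaps


if __name__ == "__main__":
    # Assumed example values: 100 warmup iterations, final accumulate of 16.
    print("modulo:", step_gaps_modulo(200, 100, 16))
    print(">=    :", step_gaps_ge(200, 100, 16))
```

Under these assumptions, the modulo test can skip far more batches than `accumulate` asks for whenever the value changes between two multiples, while the ">=" test accumulates exactly the current `accumulate` batches at every step, so the gaps grow monotonically during warmup.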

Commits on Jun 28, 2021

Fix warmup accumulate #3722
