Add gradient accumulation option for bicleaner-ai training #27

Open
radinplaid opened this issue Oct 10, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@radinplaid

The wiki recommends a batch size of 128 for 'stable training'.

It would be helpful to have an option to accumulate gradients, so that bicleaner-ai training with a larger "effective batch size" is possible on GPUs with a relatively small amount of RAM.

Fairseq calls this option "--update-freq"
Sockeye calls this option "--update-interval"
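
To make the request concrete, here is a minimal, framework-free sketch of gradient accumulation on a toy one-parameter least-squares model. It is not bicleaner-ai code and does not use TensorFlow; the names `grad_sum`, `train_epoch`, `micro_batch` and `accum_steps` are hypothetical and chosen only for illustration. The idea is that gradients from several small batches are summed and the weight is updated once per group, so the update matches what a single larger batch would have produced.

```python
import math
from typing import Sequence, Tuple

Example = Tuple[float, float]  # (input x, target y)


def grad_sum(w: float, batch: Sequence[Example]) -> float:
    """Sum of per-example gradients of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in batch)


def train_epoch(w: float, data: Sequence[Example], micro_batch: int,
                accum_steps: int, lr: float) -> float:
    """One pass of SGD that updates w once every `accum_steps` micro-batches."""
    accumulated = 0.0  # running sum of gradients since the last update
    seen = 0           # number of examples contributing to `accumulated`
    for step, start in enumerate(range(0, len(data), micro_batch), start=1):
        batch = data[start:start + micro_batch]
        accumulated += grad_sum(w, batch)
        seen += len(batch)
        if step % accum_steps == 0:
            # Average over every example seen since the last update, then step.
            w -= lr * accumulated / seen
            accumulated, seen = 0.0, 0
    if seen:
        # Flush a trailing, incomplete accumulation group.
        w -= lr * accumulated / seen
    return w


# Toy data: 256 examples with x in [0, 1) and y = 2x + 1.
data = [(i / 256, 2.0 * (i / 256) + 1.0) for i in range(256)]

# Micro-batches of 32 accumulated over 4 steps: effective batch size 128.
w_accum = train_epoch(0.0, data, micro_batch=32, accum_steps=4, lr=0.1)
# A true batch size of 128 with an update after every batch.
w_full = train_epoch(0.0, data, micro_batch=128, accum_steps=1, lr=0.1)

# The two runs should agree up to floating-point rounding.
print(math.isclose(w_accum, w_full, rel_tol=1e-9))
```

In a real deep-learning framework the benefit is that only one micro-batch of activations has to be held in GPU memory at a time, while the optimizer still sees the averaged gradient of the larger effective batch. The equivalence shown here assumes a loss that averages over examples and no batch-dependent layers such as batch normalization.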

@ZJaume
Member

ZJaume commented Nov 6, 2023

Hi @radinplaid, I agree, and I've been thinking about it since I built the tool. Unfortunately, TensorFlow does not support it natively, so we would have to replace the TensorFlow training loop function with a handmade one. Maybe at some point I'll have time to implement it. I'd gladly accept a PR if someone wants to write it.

@ZJaume ZJaume added the enhancement New feature or request label Nov 6, 2023