Free outputs callback #2598

Merged: mvpatel2000 merged 10 commits into mosaicml:dev from mvpatel2000/free-train-metrics on Oct 3, 2023

@mvpatel2000 mvpatel2000 commented Oct 2, 2023

What does this PR do?

Adds a callback that frees outputs to save memory. When train_metrics are not used, self.state.outputs is not needed, yet it can take up a non-trivial amount of memory (seq_length * vocab_size * microbatch_size * bytes_per_param). For certain long-sequence models this memory starts to matter (~1-2 GB), so having an option to free it is useful.
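As a rough illustration of the size formula above, the following sketch multiplies the four factors together. The configuration values are hypothetical and chosen only to show how a long-sequence setup can land in the ~1-2 GB range; they are not taken from this PR.

```python
def estimate_output_bytes(seq_length: int, vocab_size: int,
                          microbatch_size: int, bytes_per_param: int) -> int:
    """Bytes held by the model outputs (logits) for one microbatch."""
    return seq_length * vocab_size * microbatch_size * bytes_per_param


# Hypothetical configuration, for illustration only:
# 8192-token sequences, 50,000-entry vocabulary, microbatch of 2, 2 bytes per value.
size = estimate_output_bytes(seq_length=8192, vocab_size=50_000,
                             microbatch_size=2, bytes_per_param=2)
print(f"{size / 1e9:.2f} GB")  # 1.64 GB
```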

Existing tests (e.g., TestCallbackTrains) should be sufficient for this PR.
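The idea behind the callback can be sketched without the library. This is not the code merged in this PR: `StandInState`, `FreeOutputsSketch`, and the choice of an `after_loss` hook are assumptions made for illustration, and the real callback may clear the outputs differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StandInState:
    """Hypothetical stand-in for the trainer state; only `outputs` is modeled."""
    outputs: Optional[Any] = None


class FreeOutputsSketch:
    """Hypothetical callback that drops the outputs once the loss is computed."""

    def after_loss(self, state: StandInState) -> None:
        # Assumption: the loss has already been computed from state.outputs
        # and no train metrics will read the outputs later in this step.
        state.outputs = None


state = StandInState(outputs=[0.0] * 1_000)  # pretend these are logits
FreeOutputsSketch().after_loss(state)
print(state.outputs)  # None
```

Dropping the reference only allows the memory to be reclaimed once nothing else still points at the same object, so the saving depends on the state being the last holder of the outputs.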


What issue(s) does this change relate to?

GRT-2464


@b-chu b-chu left a comment

Approving, but wait for someone more familiar with callbacks to also approve

Review thread on tests/callbacks/callback_settings.py (resolved)
@Skylion007

Nit: isn't "free" a bit of an overloaded term in ML, or in general? What about "releaseOutputs"? That's more C++ pointer terminology, but it's a bit more exact.


@j316chuck j316chuck left a comment

LGTM, added some nits.

Review thread on composer/callbacks/free_outputs.py (outdated, resolved)
Review thread on composer/callbacks/free_outputs.py (resolved)
Review thread on composer/callbacks/generate.py (resolved)
Co-authored-by: Charles Tang <j316chuck@users.noreply.github.com>
@mvpatel2000 mvpatel2000 merged commit 9c0ba84 into mosaicml:dev Oct 3, 2023
18 checks passed
@mvpatel2000 mvpatel2000 deleted the mvpatel2000/free-train-metrics branch October 3, 2023 15:55
4 participants