Enabling the computation of validation loss and other metrics when using sequence parallelism #3183

Merged: ShashankMosaicML merged 13 commits into dev from shashank/fix_seq_parallel_eval on Apr 10, 2024

ShashankMosaicML (Contributor) commented Apr 8, 2024

What does this PR do?

  1. Enables the computation of validation loss and metrics when using sequence parallelism.
  2. Fixes a bug where the error condition used the dataloader's batch size instead of the device batch size, which causes problems when using auto packing or any packing ratio > 1. See also "fixing evaluator microbatch size" (llm-foundry#1100).
  3. Also adds a new error message for using auto microbatching with sequence parallelism when creating an evaluator.
  4. Changes `is_sampler_distributed = dataloader.sampler and ...` to `is_sampler_distributed = (dataloader.sampler is not None) and ...`.
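
For reviewers wondering why item 4 matters, here is a minimal sketch of how the two forms can differ in plain Python. The PR does not state the motivation, so treat this as one plausible reading rather than the author's reasoning. `EmptySampler`, `FakeLoader`, and `is_distributed` are hypothetical stand-ins: the right-hand side of the real expression is elided in the description and is only represented here by a placeholder.

```python
class EmptySampler:
    """Hypothetical sampler stand-in whose length is zero, so it is falsy."""

    def __len__(self):
        return 0


class FakeLoader:
    """Hypothetical dataloader stand-in; only the `sampler` attribute matters."""

    def __init__(self, sampler):
        self.sampler = sampler


def is_distributed(sampler):
    # Placeholder for the elided right-hand side of the expression in the PR.
    return True


def old_check(dataloader):
    # Truthiness test: evaluating the sampler as a boolean calls its
    # __bool__ or __len__, and `and` returns the left operand when it is falsy.
    return dataloader.sampler and is_distributed(dataloader.sampler)


def new_check(dataloader):
    # Identity test: only asks whether a sampler is present, and always
    # yields a real bool when the right-hand side does.
    return (dataloader.sampler is not None) and is_distributed(dataloader.sampler)


no_sampler = FakeLoader(None)
empty = FakeLoader(EmptySampler())

print(old_check(no_sampler), new_check(no_sampler))
print(bool(old_check(empty)), new_check(empty))
```

Under these assumptions the old form returns `None` (not `False`) when there is no sampler, and treats a present-but-zero-length sampler as absent, while the new form returns a proper boolean in both cases.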

ShashankMosaicML marked this pull request as ready for review April 8, 2024 21:43
ShashankMosaicML changed the title from "Shashank/fix seq parallel eval" to "Enabling the computation of validation loss and other metrics when using sequence parallelism" Apr 8, 2024
ShashankMosaicML enabled auto-merge (squash) April 9, 2024 23:54
ShashankMosaicML merged commit 4e54004 into dev Apr 10, 2024
14 checks passed
ShashankMosaicML deleted the shashank/fix_seq_parallel_eval branch April 10, 2024 00:40
staghado pushed a commit to lightonai/composer that referenced this pull request Apr 13, 2024
…ing sequence parallelism (mosaicml#3183)

* fix a bug in eval with seq parallelism

* print debug values

* ..

* ..

* ..

* potentially fixing the eval bug

* minor

* minor

* minor

* ..

* fixing is_sampler_distributed

* removing redundant condition
j316chuck pushed a commit that referenced this pull request May 16, 2024
…ing sequence parallelism (#3183)