
multipack w batch sampler #795

Merged: 12 commits merged into main from batch-sampler-test, Nov 8, 2023
Conversation

@winglian (Collaborator) commented Oct 27, 2023

resolves #406

@winglian winglian added enhancement New feature or request wip labels Oct 27, 2023
@winglian winglian marked this pull request as draft October 27, 2023 22:26
@winglian winglian marked this pull request as ready for review October 28, 2023 01:29
@winglian winglian removed the wip label Oct 28, 2023
@casper-hansen (Collaborator)

Could this, and should it, replace the DataLoader? The sampler looks like a simplified version of the Multipack DataLoader.

@winglian (Collaborator, Author)

Yeah, I want to rip out the previous dataloader. This is basically the same sampler, but adapted to subclass the batch sampler, plus the various fixes needed to handle uneven batches.

@casper-hansen (Collaborator)

So Multipack can be implemented as a sampler, and then we can use the standard DataLoader from PyTorch? If we go that way, it's probably best to make the full adjustment in this PR.
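The idea discussed above, packing uneven batches in a sampler so the stock PyTorch DataLoader can be used unchanged, can be sketched roughly as follows. This is a hypothetical, simplified packer, not axolotl's actual MultipackBatchSampler; a real version would subclass `torch.utils.data.BatchSampler` and use smarter bin-packing, while this pure-Python class only mirrors the interface (`__iter__` yields lists of dataset indices).

```python
# Hypothetical sketch of a length-budgeted batch sampler (not the actual
# axolotl implementation). A real version would subclass
# torch.utils.data.BatchSampler; this class only mirrors its interface.
class PackingBatchSampler:
    def __init__(self, sampler, lengths, token_budget):
        self.sampler = sampler            # any iterable of dataset indices
        self.lengths = lengths            # per-sample sequence lengths
        self.token_budget = token_budget  # max total tokens per batch

    def __iter__(self):
        batch, used = [], 0
        for idx in self.sampler:
            n = self.lengths[idx]
            if batch and used + n > self.token_budget:
                yield batch               # emit a (possibly uneven) batch
                batch, used = [], 0
            batch.append(idx)
            used += n
        if batch:
            yield batch                   # final, possibly short, batch

    def __len__(self):
        # Lower-bound estimate: greedy packing can produce more batches
        # than this because of fragmentation.
        total = sum(self.lengths)
        return -(-total // self.token_budget)  # ceil division
```

Passed as `batch_sampler=` to a standard `DataLoader`, something like this yields variable-size batches, which is also why the step-count estimation downstream had to be fixed in this PR.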

@winglian (Collaborator, Author) commented Oct 28, 2023

  • eval sample packing issues
  • 32k context issues
  • remove old dataloader
  • adamw_bnb_8bit optimizer issue (??)

@winglian winglian merged commit 641e6f7 into main Nov 8, 2023
4 checks passed
@winglian winglian deleted the batch-sampler-test branch November 8, 2023 01:27
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023
* test batch sampler w varying batch lens

* wip

* multipack batchsampler wip

* wip

* fix for prepare data loader to get correct # of steps based on gpus

* lint and clean up

* calculate len estimate

* fix total num steps calc

* add options for dataloader_num_workers and dataloader_pin_memory

* remove gitbook

* support prefetch_factor for dataloader optimization

* fix the kwarg
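As a rough illustration of the last few commits (`dataloader_num_workers`, `dataloader_pin_memory`, `prefetch_factor` support, and the kwarg fix), the new options might be forwarded to PyTorch's DataLoader roughly like this. The config key names and the helper are assumptions for illustration, not necessarily axolotl's actual code; the one hard constraint shown is real: PyTorch's DataLoader only accepts `prefetch_factor` when `num_workers > 0`.

```python
# Hypothetical helper (config keys are assumed, not axolotl's actual
# schema) mapping config options onto torch.utils.data.DataLoader kwargs.
def build_dataloader_kwargs(cfg):
    kwargs = {
        "num_workers": cfg.get("dataloader_num_workers", 0),
        "pin_memory": cfg.get("dataloader_pin_memory", False),
    }
    # PyTorch raises if prefetch_factor is set without worker processes,
    # so only forward it when num_workers > 0.
    if kwargs["num_workers"] > 0 and cfg.get("dataloader_prefetch_factor"):
        kwargs["prefetch_factor"] = cfg["dataloader_prefetch_factor"]
    return kwargs
```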
Successfully merging this pull request may close these issues.

sample packing with resume_from_checkpoint