
Added torch_dmoe defaults, bug fixes for 2D inputs #1210

Merged
merged 31 commits on May 15, 2024

Conversation

snarayan21
Contributor

@snarayan21 snarayan21 commented May 15, 2024

Users are hitting errors in their configs when changing ffn_type from mb_dmoe to torch_dmoe, because torch_dmoe does not set any defaults. This PR gives torch_dmoe the same defaults as mb_dmoe, following the Arguments dataclass in the Megablocks repo. Fixed typing in some places as well, and added a test to make sure that mb_dmoe and torch_dmoe produce the same forward and backward results for these default values.
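A minimal sketch of the fwd/bwd parity pattern the test uses. The modules here are hypothetical stand-ins (plain Linear layers with copied weights), not the actual mb_dmoe/torch_dmoe layers; the real test builds both MoE implementations from the same config.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: in the real test these would be the
# mb_dmoe and torch_dmoe layers built from the same default config.
torch.manual_seed(0)
ref = nn.Linear(256, 256)
impl = nn.Linear(256, 256)
impl.load_state_dict(ref.state_dict())  # identical parameters

# Same input tensor for both, tracked separately for gradients.
x_ref = torch.randn(4, 256, requires_grad=True)
x_impl = x_ref.detach().clone().requires_grad_(True)

# Forward parity: outputs must match.
y_ref, y_impl = ref(x_ref), impl(x_impl)
assert torch.allclose(y_ref, y_impl)

# Backward parity: same upstream gradient, compare input grads.
y_ref.sum().backward()
y_impl.sum().backward()
assert torch.allclose(x_ref.grad, x_impl.grad)
```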

There was also a bug where, for some inputs (top_k = 1), the outputs of the max function were 1-dimensional instead of 2-dimensional. This bug has also been addressed.
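A sketch of the shape mismatch, assuming the router takes a max over expert scores when top_k = 1: torch.max drops the reduced dimension by default, whereas torch.topk (used for top_k > 1) keeps it.

```python
import torch

# Hypothetical router scores: 4 tokens over 8 experts.
scores = torch.randn(4, 8)

# With top_k > 1, torch.topk returns 2-D results: (tokens, top_k).
weights, experts = torch.topk(scores, k=2, dim=-1)
assert weights.shape == (4, 2)

# With top_k = 1, torch.max drops the reduced dimension and
# returns 1-D tensors, breaking code that expects (tokens, 1).
w1, e1 = scores.max(dim=-1)
assert w1.shape == (4,)

# keepdim=True (or an explicit unsqueeze) restores the 2-D shape.
w1, e1 = scores.max(dim=-1, keepdim=True)
assert w1.shape == (4, 1)
```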

There's a separate bug in Megablocks: when hidden_size and ffn_hidden_size are both 128, an unintended failure occurs, which is why hidden_size has been bumped to 256 in the test. When hidden_size and ffn_hidden_size are not divisible by 128, an unhelpful error is thrown. Will open a separate PR on the Megablocks side to address this.

@snarayan21 snarayan21 requested a review from dakinggg May 15, 2024 21:09
@snarayan21 snarayan21 requested review from mvpatel2000, dakinggg, b-chu and irenedea and removed request for dakinggg May 15, 2024 21:09
@dakinggg dakinggg requested a review from bigning May 15, 2024 21:12
@mvpatel2000
Collaborator

LGTM but want @bigning 's signoff

Contributor

@bigning bigning left a comment


LGTM, thanks!

@snarayan21 snarayan21 merged commit b414626 into main May 15, 2024
9 checks passed