Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Library Request: mamba #7629

Closed
4 tasks done
AlexCatarino opened this issue Dec 14, 2023 · 0 comments · Fixed by #7856
Closed
4 tasks done

Library Request: mamba #7629

AlexCatarino opened this issue Dec 14, 2023 · 0 comments · Fixed by #7856

Comments

@AlexCatarino
Copy link
Member

Expected Behavior

Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of Transformers. It is based on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of FlashAttention.

Checklist

  • I have completely filled out this template
  • I have confirmed that this issue exists on the current master branch
  • I have confirmed that this is not a duplicate issue by searching issues
  • I have provided detailed steps to reproduce the issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant