Replace mamba1 with mamba2 and training becomes very slow! #510

Open
YQ-097 opened this issue Aug 1, 2024 · 2 comments

Comments

YQ-097 commented Aug 1, 2024

Decorating the model with @torch.compile(options={"triton.cudagraphs": True}, fullgraph=True) raises an error. Is there any other way to speed up training?

Collaborator

tridao commented Aug 1, 2024

If you use a large model, the Triton overhead will be negligible.
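A back-of-the-envelope sketch of why that would hold, assuming the Triton overhead is a roughly fixed cost per kernel call while the compute time grows with model size. The function name and all numbers here are hypothetical, chosen only for illustration; they are not measurements of Mamba2.

```python
def overhead_fraction(launch_overhead_us: float, kernel_compute_us: float) -> float:
    """Share of one kernel call's wall time spent on fixed launch overhead."""
    return launch_overhead_us / (launch_overhead_us + kernel_compute_us)


# Hypothetical fixed per-call overhead, in microseconds (an assumption).
LAUNCH_OVERHEAD_US = 50.0

# Larger models spend more time computing per call, so the fixed
# overhead becomes a smaller share of the total.
for compute_us in (10.0, 100.0, 1_000.0, 10_000.0):
    frac = overhead_fraction(LAUNCH_OVERHEAD_US, compute_us)
    print(f"compute={compute_us:>8.0f} us  overhead share={frac:.1%}")
```

Under this simple model, a small network can be dominated by launch overhead, while a large one barely notices it. Profiling the actual training loop is the only way to confirm which regime applies.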

@dragonBrother1

@torch.compile(options={"triton.cudagraphs": True}, fullgraph=True) generates an error. Is there any other way?

I ran into some problems when I chose to use mamba2 instead of mamba.
image
Does it mean that I should change the MambaConfig?
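For reference, a minimal sketch of what such a config change might look like. It assumes the model is built from a MambaConfig-style config and that the block type is selected through a "layer" key inside ssm_cfg, as in the config files published with the pretrained Mamba2 checkpoints. The sizes below are hypothetical, and since the screenshot is not reproduced here, this may not address the actual error.

```python
import json

# Hypothetical small config, written as a plain dict so it runs without
# mamba_ssm installed. Field names follow MambaConfig as I understand it.
mamba2_config = {
    "d_model": 768,       # hypothetical size
    "n_layer": 24,        # hypothetical depth
    "vocab_size": 50277,
    # Assumption: omitting "layer" falls back to the original Mamba block,
    # so it has to be set explicitly to get Mamba2.
    "ssm_cfg": {"layer": "Mamba2"},
}

# Serialize it the way a config.json would be stored next to a checkpoint.
config_json = json.dumps(mamba2_config, indent=2)
print(config_json)
```

With mamba_ssm installed, a dict like this could presumably be unpacked into the config class, e.g. MambaConfig(**mamba2_config), but that step is untested here.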

3 participants