
2bit 405b? #124

Open
ewof opened this issue Sep 18, 2024 · 3 comments

Comments

@ewof

ewof commented Sep 18, 2024

would be cool

@Vahe1994
Owner

Hi, @ewof!

Thank you for your suggestion. There are several technical difficulties in making it fit onto the GPUs for quantization, but it is definitely possible, and we are already working on it. Unfortunately, we are a bit short on manpower, so I'm not sure when, or if, this will happen.

@OpenSourceRonin

Hi @ewof and @Vahe1994,

No offense intended. AQLM is a fantastic project, and VPTQ has acknowledged your work in its acknowledgments.

I've successfully reproduced the VPTQ method and released several models on Hugging Face, including the 405B LLaMA 3.1, 70B LLaMA 3.1, and 72B LLaMA 3.2.

I welcome discussion and testing—let's explore these together!

@Vahe1994
Owner

Vahe1994 commented Oct 7, 2024

Hey @OpenSourceRonin,
Thank you for letting us know. We are all for open source and for making models available to people, regardless of which quantization was used. So, of course, no offense taken. You did a great job — thank you for your work!
