2bit 405b? #124
Hi, @ewof! Thank you for your suggestion. There are several technical difficulties in making it fit into the GPUs for quantization, but it is definitely possible. We are already working on this. Unfortunately, we are a bit short on manpower, so I'm not sure when or if this will happen.
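For context on why fitting a 405B model is hard, here is a back-of-the-envelope sketch of the raw weight memory at different bit widths. The numbers are illustrative only: real quantized checkpoints add codebook overhead, unquantized layers (e.g. embeddings), and activation/KV-cache memory on top of this.

```python
# Rough weight-memory estimate for a 405B-parameter model at
# several bit widths (illustration only; ignores codebooks,
# unquantized layers, and runtime activation memory).
params = 405e9

for bits in (16, 4, 2):
    gib = params * bits / 8 / 2**30  # bytes -> GiB
    print(f"{bits}-bit weights: ~{gib:.0f} GiB")
# → 16-bit weights: ~754 GiB
# → 4-bit weights: ~189 GiB
# → 2-bit weights: ~94 GiB
```

At 2 bits per weight the model drops to roughly 94 GiB, which is why 2-bit quantization is what makes a 405B model plausible on a small number of GPUs at all.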
No offense intended. AQLM is a fantastic project, and VPTQ has acknowledged your work in its acknowledgments. I've successfully reproduced the VPTQ method and released several models on Hugging Face, including the 405B LLaMA 3.1, 70B LLaMA 3.1, and 72B LLaMA 3.2. I welcome discussion and testing—let's explore these together!
Hey @OpenSourceRonin, would be cool