
Gemma2 Support #529

Open
yc-wang00 opened this issue Jul 1, 2024 · 2 comments

Comments

@yc-wang00

Hi team, I am opening this issue to request support for the Google Gemma 2 models.

Recently, Google released two models: google/gemma-2-27b and google/gemma-2-9b. As an initial trial, we attempted to run these new models through the existing Gemma path, but it didn't work as expected. Specifically, when I tried to quantize google/gemma-2-9b, the quantized model produced only nonsensical outputs.
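For reference, this is roughly how we invoked the existing Gemma path — a sketch using the standard AutoAWQ API, with our own choice of quant settings and output directory (running it requires the model weights and a GPU):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "google/gemma-2-9b"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the FP16 model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run AWQ calibration + 4-bit quantization, then save the result.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized("gemma-2-9b-awq")
```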

Could someone please investigate and add support for Gemma 2?

Thank you very much!!!

@casper-hansen
Owner

I made an initial attempt that did not work (main...gemma2). Unfortunately, I do not have enough time at the moment to research further how to support the new architecture.

The biggest change I see for quantizing the model is that each decoder layer now has a pre-feedforward and a post-feedforward layernorm. So there is some challenge in quantizing it correctly with AWQ. Maybe @TechxGenus or someone else can help contribute.
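For anyone picking this up, here is a minimal pure-Python sketch of the Gemma 2 decoder-layer ordering as I understand it from the Hugging Face implementation (the norm names follow that implementation; `attn` and `mlp` are stand-ins for the real sublayers):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # Gemma-style RMSNorm: scale to unit RMS, then apply (1 + weight).
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [(v / rms) * (1.0 + w) for v, w in zip(x, weight)]

def decoder_layer(hidden, attn, mlp, norms):
    # Gemma 2 sublayer ordering: both the attention block and the MLP
    # block are sandwiched between two norms before the residual add.
    residual = hidden
    h = rms_norm(hidden, norms["input_layernorm"])
    h = attn(h)
    h = rms_norm(h, norms["post_attention_layernorm"])
    hidden = [r + v for r, v in zip(residual, h)]

    residual = hidden
    h = rms_norm(hidden, norms["pre_feedforward_layernorm"])
    h = mlp(h)
    h = rms_norm(h, norms["post_feedforward_layernorm"])
    return [r + v for r, v in zip(residual, h)]
```

The extra pre/post-feedforward norms are what the existing Gemma quantization path does not account for, since it only expects the usual input and post-attention layernorms.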

@TechxGenus
Contributor

There are still many open issues in community support for Gemma 2 (e.g. logits soft-capping, fp16 numerical issues, sliding-window attention). I suggest waiting until they are all resolved.

@radi-cho mentioned this issue Jul 31, 2024