
Gemma2 Support #529

Open
yc-wang00 opened this issue Jul 1, 2024 · 2 comments

Comments

@yc-wang00

Hi team, I am opening this issue to request support for the Google Gemma 2 models.

Recently, Google released two models: google/gemma-2-27b and google/gemma-2-9b. As an initial trial, we attempted to run these new models through the existing Gemma path, but it didn't work as expected. Specifically, when I tried to quantize google/gemma-2-9b, the quantized model produced only nonsensical outputs.
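For reference, this is roughly how we invoked the existing Gemma path — a sketch using the standard AutoAWQ API, with our own choice of quant settings and output directory (running it requires the model weights and a GPU):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "google/gemma-2-9b"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the FP16 model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Run AWQ calibration + 4-bit quantization, then save the result.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized("gemma-2-9b-awq")
```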

Could someone please investigate and add support for Gemma 2?

Thank you very much!!!

@casper-hansen
Owner

I made an initial attempt that did not work (main...gemma2). Unfortunately, I do not have enough time at the moment to research further how to support the new architecture.

The biggest change I see for quantizing the model is that each decoder layer now has a pre-feedforward and a post-feedforward layernorm. So there is some challenge in quantizing it correctly with AWQ. Maybe @TechxGenus or someone else can help contribute.
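For anyone picking this up, here is a minimal pure-Python sketch of the Gemma 2 decoder-layer ordering as I understand it from the Hugging Face implementation (the norm names follow that implementation; `attn` and `mlp` are stand-ins for the real sublayers):

```python
import math

def rms_norm(x, weight, eps=1e-6):
    # Gemma-style RMSNorm: scale to unit RMS, then apply (1 + weight).
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [(v / rms) * (1.0 + w) for v, w in zip(x, weight)]

def decoder_layer(hidden, attn, mlp, norms):
    # Gemma 2 sublayer ordering: both the attention block and the MLP
    # block are sandwiched between two norms before the residual add.
    residual = hidden
    h = rms_norm(hidden, norms["input_layernorm"])
    h = attn(h)
    h = rms_norm(h, norms["post_attention_layernorm"])
    hidden = [r + v for r, v in zip(residual, h)]

    residual = hidden
    h = rms_norm(hidden, norms["pre_feedforward_layernorm"])
    h = mlp(h)
    h = rms_norm(h, norms["post_feedforward_layernorm"])
    return [r + v for r, v in zip(residual, h)]
```

The extra pre/post-feedforward norms are what the existing Gemma quantization path does not account for, since it only expects the usual input and post-attention layernorms.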

@TechxGenus
Contributor

There are still many open issues in community support for Gemma 2 (e.g. logits soft-capping, fp16 numerical issues, sliding-window attention). I suggest waiting until they are all resolved.

@radi-cho mentioned this issue Jul 31, 2024