
Gemma ONNX export & ORT support #1714

Merged — 4 commits merged into huggingface:main on Feb 26, 2024
Conversation

fxmarty (Contributor) commented Feb 23, 2024

As per title

echarlaix (Collaborator) commented Feb 23, 2024

Actually, as raised by @eaidova, it looks like `head_dim != hidden_size // num_attention_heads` for some models:

https://huggingface.co/google/gemma-7b/blob/main/config.json#L9

whereas the export code derives it as `self.hidden_size // self.num_attention_heads`.
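A quick sketch of the mismatch being pointed out, using the values from the `google/gemma-7b` config linked above (the numbers are copied from that config as of this discussion and are illustrative only):

```python
# Values taken from google/gemma-7b config.json (linked above);
# if the config changes upstream, treat these as illustrative.
hidden_size = 3072
num_attention_heads = 16
head_dim = 256  # set explicitly in the config

# The naive derivation a generic export path might use:
derived_head_dim = hidden_size // num_attention_heads

print(derived_head_dim)              # 192
print(derived_head_dim == head_dim)  # False: gemma-7b sets head_dim independently
```

So for this model the per-head dimension cannot be recovered from `hidden_size` and `num_attention_heads` alone; it has to be read from the config's own `head_dim` field.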

@fxmarty fxmarty changed the title Gemma ONNX export Gemma ONNX export & ORT support Feb 23, 2024
fxmarty (Contributor, Author) commented Feb 23, 2024

@echarlaix I think we're fine — the CI passes, and

```
optimum-cli export onnx -m google/gemma-2b gemma_onnx
```

followed by

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

model = ORTModelForCausalLM.from_pretrained("gemma_onnx")
tokenizer = AutoTokenizer.from_pretrained("gemma_onnx")

inp = tokenizer(["Today I am in Paris and", "I am"], padding=True, return_tensors="pt")

res = model.generate(**inp, max_new_tokens=20)
print(tokenizer.batch_decode(res))

# Compare against the reference PyTorch model
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
res = model.generate(**inp, max_new_tokens=20)
print(tokenizer.batch_decode(res))
```

behaves as expected

@echarlaix mentioned this pull request Feb 23, 2024
@echarlaix linked an issue Feb 23, 2024 that may be closed by this pull request
@fxmarty merged commit e0cbf7d into huggingface:main Feb 26, 2024 (53 of 61 checks passed)
@Kaya-P mentioned this pull request Feb 27, 2024
young-developer pushed a commit to young-developer/optimum that referenced this pull request May 10, 2024:

* gemma onnx export
* fix tests
* fix model
* fix
Successfully merging this pull request may close these issues:

Native Support for Gemma