
Add MPT onnx and ORT support #1161

Merged
merged 6 commits into from
Aug 31, 2023

Conversation

jiqing-feng
Contributor

Hi @fxmarty @echarlaix

Related to #1101. This PR adds an ONNX config for MPT models to support generating dummy inputs.

The shape layout of past_key_values can differ across models, so we use the sequence_length member variable of DummyPastKeyValuesGenerator to identify the sequence length of past_key_values.
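To make the shape handling concrete, here is a minimal sketch (plain Python, not the actual Optimum classes; all names and shapes are illustrative) of how a dummy past-key-values generator can expose a `sequence_length` attribute that per-model ONNX configs read back when building KV-cache inputs:

```python
# Illustrative sketch (NOT the real Optimum implementation): a dummy
# past_key_values generator that exposes `sequence_length` so the ONNX
# config can derive per-model KV-cache shapes from it.

class DummyPastKeyValuesGenerator:
    """Minimal stand-in for a generator of dummy KV-cache inputs."""

    def __init__(self, batch_size, sequence_length, num_heads, head_dim, num_layers):
        self.batch_size = batch_size
        self.sequence_length = sequence_length  # the member the PR relies on
        self.num_heads = num_heads
        self.head_dim = head_dim
        self.num_layers = num_layers

    def kv_shape(self):
        # Decoder-only models such as MPT or llama commonly use the layout
        # (batch_size, num_heads, past_sequence_length, head_dim).
        return (self.batch_size, self.num_heads, self.sequence_length, self.head_dim)

    def generate(self):
        # One (key_shape, value_shape) pair per decoder layer.
        return [(self.kv_shape(), self.kv_shape()) for _ in range(self.num_layers)]


gen = DummyPastKeyValuesGenerator(
    batch_size=2, sequence_length=16, num_heads=8, head_dim=64, num_layers=4
)
past = gen.generate()
print(len(past), past[0][0])  # 4 (2, 8, 16, 64)
```

A model whose cache uses a different axis order would only override `kv_shape`, which is the kind of per-model variation the comment above refers to.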

Could you please help review it? Thanks!

cc @changwangss

@fxmarty
Contributor

fxmarty commented Jul 6, 2023

Hi @jiqing-feng, thank you for the PR! I'm personally not in favor of supporting, in Optimum, custom models that otherwise require trust_remote_code=True. I think it would make more sense to host custom ONNX configs in the model repos directly, and allow the main_export function to use a custom ONNX config.

What is your opinion @echarlaix @JingyaHuang @michaelbenayoun ?

I propose #1166 #1143 in this regard, see the API here: https://moon-ci-docs.huggingface.co/docs/optimum/pr_1166/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models

@jiqing-feng
Contributor Author

Hi @fxmarty . Thanks for your comment.

The MPT model is being upstreamed into transformers, see #24629. We won't need trust_remote_code=True once that PR is merged.

We can wait for #24629 to be merged and then discuss whether we can move forward with my PR.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 6, 2023

The documentation is not available anymore as the PR was closed or merged.

@JingyaHuang
Collaborator

@fxmarty @jiqing-feng For ease of maintenance, Optimum should only support configs of transformers-native models. For those with trust_remote_code (chatglm and mpt for now), hosting directly on the Hub sounds very nice to me.

Btw, mpt sounds like a good example to test what you suggest @fxmarty

@fxmarty
Contributor

fxmarty commented Jul 6, 2023

@jiqing-feng Yes, once mpt is merged & released in transformers for sure we can merge this PR in Optimum! My suggestion is more for models with custom modeling code (as mpt is currently).

@jiqing-feng
Contributor Author

Hi @fxmarty. MPT has now been merged into transformers, see #24629.

Could we merge this PR to support the MPT ONNX config?

cc @JingyaHuang

@fxmarty
Contributor

fxmarty commented Aug 1, 2023

Hi @jiqing-feng, for sure we can merge once there is a transformers release! Is the KV cache layout fine?

@jiqing-feng
Contributor Author

> Hi @jiqing-feng, for sure we can merge once there is a transformers release! Is the KV cache layout fine?

Yes! The KV cache layout is the same as in most causal LM models such as llama. We can wait for the release.
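As a concrete illustration of that llama-style layout (shapes written as plain tuples rather than real tensors; the helper name is hypothetical), the past sequence length sits on axis 2 of each cached key:

```python
# Illustrative sketch: with the llama-style KV-cache layout
# (batch, num_heads, past_seq_len, head_dim), the past sequence length
# can be read off axis 2 of any cached key. Shapes are plain tuples.

def past_sequence_length(past_key_values):
    """past_key_values: one (key_shape, value_shape) pair per layer."""
    key_shape, _ = past_key_values[0]
    return key_shape[2]

# A 4-layer cache for batch 1, 8 heads, 10 past tokens, head dim 64.
cache = [((1, 8, 10, 64), (1, 8, 10, 64)) for _ in range(4)]
print(past_sequence_length(cache))  # 10
```

Models with a different axis order (e.g. sequence length on axis 1) would need a different index here, which is why the layout question matters for the export.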

@jiqing-feng
Contributor Author

jiqing-feng commented Aug 29, 2023

Hi @fxmarty @JingyaHuang. I think we can move forward on this PR since the recently released version of transformers contains the MPT model. I see that some checks have failed. Could you tell me where and what kind of tests I should add? Thanks!

@echarlaix
Collaborator

echarlaix commented Aug 29, 2023

> Would you please tell me where and what kind of tests should I add? Thx!

Could you add ONNX Runtime tests as well by adding the model name here and here using a tiny random model like this one?

@jiqing-feng
Contributor Author

> Would you please tell me where and what kind of tests should I add? Thx!

> Could you add ONNX Runtime tests as well by adding the model name here and here using a tiny random model like this one?

Done. Thx!

@jiqing-feng
Contributor Author

Hi @echarlaix @fxmarty @JingyaHuang

Could we merge this PR? The failed checks are not related to my changes, and the mpt model has been released in the latest version of transformers. Thx!

@fxmarty fxmarty changed the title support MPT onnx generate dummy inputs Add MPT onnx and ORT support Aug 31, 2023
Contributor

@fxmarty fxmarty left a comment


@jiqing-feng Thank you for the addition, and apologies for the delay!

@fxmarty fxmarty merged commit 7450ca3 into huggingface:main Aug 31, 2023
63 of 68 checks passed