
xlm-roberta and Mistral-7B take significant amounts of memory during compilation #2821

Open
cjvolzka opened this issue May 9, 2024 · 2 comments

cjvolzka commented May 9, 2024

While compiling models like HuggingFace protectai/xlm-roberta-base-language-detection-onnx or mistralai/Mistral-7B-v0.1, I notice compilation uses significantly more memory than the entire model size.

For example, the xlm-roberta-base-language-detection-onnx model is about 1.11 GB, but during compilation I see peaks of up to 9 GB of memory used by onnx-mlir, opt, and llc when compiling with --O3 --EmitLib --mtriple=s390x-ibm-loz --mcpu=z14 --onnx-op-stats TXT.

The Mistral-7B-v0.1 model is about 29 GB, but during compilation I see peaks of over 70 GB and sustained usage of 58 GB when compiling with --O3 --EmitLib --mtriple=s390x-ibm-loz --mcpu=z14 --store-constants-to-file --onnx-op-stats TXT.

Is there anything that can be done to reduce the compile-time memory required for these kinds of models?
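For reference, one way to reproduce the peak numbers above is to measure the peak resident set size of the compile as a child process. This is a minimal sketch using only the Python standard library (Unix-only, since it relies on the resource module); the demo command at the bottom is a stand-in, not the actual onnx-mlir invocation:

```python
# Sketch: measure the peak resident set size (max RSS) of a child process,
# e.g. an onnx-mlir compile. Unix-only (uses the resource module).
import resource
import subprocess
import sys

def peak_rss_kb(cmd):
    """Run cmd to completion and return the peak RSS of waited-for children.

    On Linux, ru_maxrss is reported in kilobytes; on macOS it is in bytes,
    so normalize to kilobytes.
    """
    subprocess.run(cmd, check=True)
    kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    if sys.platform == "darwin":
        kb //= 1024
    return kb

if __name__ == "__main__":
    # Substitute the real compile command reported above, e.g.:
    # ["onnx-mlir", "--O3", "--EmitLib", "--mtriple=s390x-ibm-loz",
    #  "--mcpu=z14", "--onnx-op-stats", "TXT", "model.onnx"]
    demo = [sys.executable, "-c", "x = b'x' * (50 * 1024 * 1024)"]
    print("peak RSS of child: %d MB" % (peak_rss_kb(demo) // 1024))
```

Note that RUSAGE_CHILDREN accumulates over all children the process has waited for, so run one compile per measuring process for a clean number.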

@cjvolzka cjvolzka changed the title Models take significant amounts of memory to compile xlm-roberta and Mistral-7B take significant amounts of memory during compilation May 9, 2024

imaihal commented May 10, 2024

@cjvolzka How can we get the ONNX model for Mistral-7B-v0.1?

cjvolzka (Author) commented

@imaihal Sorry, I missed your question. Below is how I generated the Mistral onnx model.

Notes:

  • I exported the model on my Mac, as the tools don't support s390x. Afterward, I transferred the folder it created (with the ONNX file and constants) to the s390x host to compile the model.
  • The huggingface-cli command will ask a couple of questions:
pip install huggingface_cli optimum
huggingface-cli login
optimum-cli export onnx --model mistralai/Mistral-7B-v0.1 --framework pt --atol 0.001 --task text-generation Mistral-7B-v0.1-text-generation
