xlm-roberta and Mistral-7B take significant amounts of memory during compilation #2821

cjvolzka · 2024-05-09T00:56:51Z

While compiling models such as HuggingFace protectai/xlm-roberta-base-language-detection-onnx or mistralai/Mistral-7B-v0.1, I notice that compilation uses significantly more memory than the size of the entire model.

For example, xlm-roberta-base-language-detection-onnx is about 1.11GB, but during compilation I see peaks of up to 9GB of memory used by onnx-mlir, opt and llc when compiling with --O3 --EmitLib --mtriple=s390x-ibm-loz --mcpu=z14 --onnx-op-stats TXT.

The Mistral-7B-v0.1 model is about 29GB, but during compilation I see peaks of 70+GB and a sustained 58GB of memory when compiling with --O3 --EmitLib --mtriple=s390x-ibm-loz --mcpu=z14 --store-constants-to-file --onnx-op-stats TXT.

Is there anything that can be done to reduce the compile-time memory required for these kinds of models?
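
For anyone who wants to reproduce the peak numbers above, here is a minimal sketch of one way to measure them. It is not necessarily how the figures in this issue were collected. It runs a command and reads the peak resident set size that the OS recorded for child processes. Note that `ru_maxrss` reports the largest single waited-for descendant process, not the sum across onnx-mlir, opt and llc, and the helper name `peak_child_rss_bytes` is made up for this example.

```python
import resource
import subprocess
import sys


def peak_child_rss_bytes(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_bytes).

    The peak is the largest resident set size of any child process this
    Python process has waited for so far, so call it once per fresh
    interpreter for a clean reading. Unix only.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss * scale


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to measure.
    code, peak = peak_child_rss_bytes(sys.argv[1:])
    print(f"exit code {code}, peak RSS {peak / 2**30:.2f} GiB")
```

Assuming the script is saved as `peak_rss.py` (a hypothetical name), it could be invoked as `python peak_rss.py onnx-mlir --O3 --EmitLib --mtriple=s390x-ibm-loz --mcpu=z14 model.onnx`, where `model.onnx` stands in for the actual model path.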


imaihal · 2024-05-10T08:05:50Z

@cjvolzka How can we get the ONNX model for Mistral-7B-v0.1?

cjvolzka · 2024-05-28T20:44:45Z

@imaihal Sorry, I missed your question. Below is how I generated the Mistral onnx model.

Notes:

- I exported the model on my Mac because the tools don't support s390x. Afterward, I transferred the folder it created (with the onnx file and constants) to the s390x host to compile the model.
- The huggingface-cli command will ask a couple of questions:
  - Use https://huggingface.co/settings/tokens to generate the token it requests.
  - You don't need to add the token as a git credential.

pip install "huggingface_hub[cli]" optimum
huggingface-cli login
optimum-cli export onnx --model mistralai/Mistral-7B-v0.1 --framework pt --atol 0.001 --task text-generation Mistral-7B-v0.1-text-generation
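
Since the export writes a folder containing the onnx file plus separate constant files, the on-disk size to compare against compile-time memory is the total for the whole folder. A small sketch for checking that total is below. The function name `export_size_bytes` is made up for this example, and whether the "about 29GB" figure in this issue is in GB or GiB is not stated.

```python
from pathlib import Path


def export_size_bytes(folder):
    """Return the combined size in bytes of all regular files under folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())
```

For example, `export_size_bytes("Mistral-7B-v0.1-text-generation") / 2**30` would give the size in GiB of the folder produced by the export command above.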

cjvolzka changed the title ~~Models take significant amounts of memory to compile~~ xlm-roberta and Mistral-7B take significant amounts of memory during compilation May 9, 2024
