
Change Transformers export default sequence length to max_position_embeddings #1826

Merged
merged 2 commits into main from default-sequence-length-config on Nov 13, 2023

Conversation

@mgoin (Member) commented Nov 13, 2023

Usage

sparseml.transformers.export_onnx --model_path ./TinyLlama-1.1B-Chat-v0.3 --task text-generation --trust_remote_code 
2023-11-13 18:56:23 sparseml.transformers.export INFO     Attempting onnx export for model at /home/mgoin/models/TinyLlama-1.1B-Chat-v0.3 for task text-generation
2023-11-13 18:56:23 sparseml.transformers.export INFO     Using default sequence length of 2048
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
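
For reference, the "default sequence length of 2048" reported above is simply the model's max_position_embeddings. A quick standalone check (assuming the model has already been downloaded to the path used above, as in the command):

from transformers import AutoConfig

# Read the context-window size straight off the Hugging Face config;
# the local path is the same one passed to sparseml.transformers.export_onnx above.
config = AutoConfig.from_pretrained("./TinyLlama-1.1B-Chat-v0.3", trust_remote_code=True)
print(config.max_position_embeddings)  # -> 2048, matching the exporter's default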

Reason

Unfortunately, I think it is very important to export LLMs with the largest possible sequence length.

For instance, I was exporting llama models with a sequence length of 512 for convenience:
sparseml.transformers.export_onnx --model_path ./llama2.c-stories15M --task text-generation --sequence_length 512
However, due to the PyTorch export and constant folding, this seems to produce a cached rotary embedding fixed to the sequence length used at export time (seen as 512 in the screenshot below). The ONNX model can still run and compile with a sequence length larger than 512, but the output is unstable and quickly starts to repeat.

[Screenshot, 2023-11-08: the cached rotary embedding in the exported graph, fixed at sequence length 512]
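
A minimal sketch of the behavior described by this PR's title (not the actual sparseml diff; the function and argument names below are illustrative): when no --sequence_length is given, fall back to the config's max_position_embeddings so the constant-folded rotary embedding covers the full context window.

from typing import Optional

from transformers import AutoConfig


def resolve_sequence_length(model_path: str, sequence_length: Optional[int] = None) -> int:
    # An explicit user-provided value always wins.
    if sequence_length is not None:
        return sequence_length
    # Otherwise default to the model's full context window so the exported
    # (constant-folded) rotary embedding is not truncated.
    config = AutoConfig.from_pretrained(model_path)
    return config.max_position_embeddings


# Example: resolve_sequence_length("./TinyLlama-1.1B-Chat-v0.3") -> 2048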

@bfineran (Contributor) left a comment


LGTM - might want to have a fallback default in case max_position_embeddings is never set, but it may just be the case that HF has a default already

@mgoin (Member, Author) commented Nov 13, 2023

(replying to: max_position_embeddings)

The config class itself has a default value for this, so I'm inclined to leave it be. We could raise a ValueError instead, but that would be reflected downstream anyway.
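
To illustrate the point about the config default (checked against the transformers releases current at the time; exact defaults can vary by architecture and version):

from transformers import LlamaConfig

# A freshly constructed config never leaves max_position_embeddings unset,
# so the exporter's new default always resolves to some value.
print(LlamaConfig().max_position_embeddings)  # -> 2048 in then-current transformers releases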

@mgoin merged commit 1bcf835 into main on Nov 13, 2023
11 checks passed
@mgoin deleted the default-sequence-length-config branch on November 13, 2023 at 20:30
bfineran pushed a commit that referenced this pull request on Nov 16, 2023:
Change Transformers export default sequence length to max_position_embeddings (#1826)

* Change Transformers export default sequence length to max_position_embeddings
* Fix style