Describe the bug
I get the error `RuntimeError: probability tensor contains either inf, nan or element < 0` when trying to run `deepspeed_engine.generate` with `meta-llama/Meta-Llama-3-8B-Instruct` initialized with either 4-bit or 8-bit quantization.
Note that this bug is specific to `meta-llama/Meta-Llama-3-8B-Instruct`. If I replace it with `kevin009/babyllama-v0.6`, no error is raised.
Atry changed the title to "[BUG] RuntimeError encountered when generating tokens from a Meta-Llama-3-8B-Instruct model initialized with 4-bit or 8-bit quantization" on Jun 11, 2024.
To Reproduce
Run the following code
Then the output is
Expected behavior
No error
ds_report output
Screenshots
Not applicable
System info (please complete the following information):
Launcher context
Just the `python` CLI, not the `deepspeed` CLI.
Docker context
Not using Docker
Additional context
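For reference, this exact error message is raised by `torch.multinomial` when the post-softmax probability tensor contains `inf`/`nan` or negative entries, which is a common symptom of numerical overflow in the logits under 4-bit/8-bit quantization. A minimal pure-Python sketch of the equivalent validation (hypothetical helper for illustration, not DeepSpeed or PyTorch code):

```python
import math
import random

def sample_index(probs):
    """Mimic the validation torch.multinomial performs before sampling.

    `probs` is a plain list of floats standing in for the probability
    tensor produced by softmax over the model's logits. If overflow in
    the quantized forward pass turns any entry into inf/nan, sampling
    fails with exactly the RuntimeError reported above.
    """
    if any(math.isnan(p) or math.isinf(p) or p < 0 for p in probs):
        raise RuntimeError(
            "probability tensor contains either `inf`, `nan` or element < 0"
        )
    # Naive inverse-CDF sampling as a stand-in for torch.multinomial.
    r = random.random() * sum(probs)
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1
```

This is only meant to show where the message originates; the actual check runs inside `torch.multinomial` during `generate`'s sampling step.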