After fine-tuning and quantizing the gemma-2b-it model, the average inference latency increased from 120 ms to 160 ms, so the quantized model is slower rather than faster. Could there be an issue with the quantization? I'm attaching parts of the code for fine-tuning and quantization.
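To make a before/after comparison like this reproducible, the latency can be measured with a small timing helper. The following is only a minimal sketch, not the code used in this issue: `mean_latency_ms` and the stand-in workload are hypothetical names, and the callable would need to wrap the actual generation call. On a GPU, the callable would also have to wait for the device to finish (for example with `torch.cuda.synchronize()`), since CUDA kernels are launched asynchronously.

```python
import statistics
import time
from typing import Callable


def mean_latency_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the mean wall-clock latency of fn() in milliseconds.

    Hypothetical helper: warm-up calls are executed but not timed, so that
    one-time costs (lazy initialization, caches) do not skew the mean.
    """
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


if __name__ == "__main__":
    # Stand-in workload (an assumption); replace with the real inference call.
    def dummy_inference() -> int:
        return sum(i * i for i in range(100_000))

    print(f"mean latency: {mean_latency_ms(dummy_inference):.2f} ms")
```

Running the same helper with identical prompts, batch size, and generation length for both the original and the quantized model keeps the two numbers comparable.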
Yimjaehyun93 changed the title from "Performance Degradation After Quantizing Fine-Tuned gemma-2b-it Model" to "Slowed Down After Quantizing Fine-Tuned gemma-2b-it Model" on Aug 5, 2024.