Describe the bug
I cannot load a dynamically quantized RoBERTa model for CPU inference in ONNX format.
I can load the pre-quantized model just fine. I am currently working on a Vertex AI instance on GCP.
Expected behavior
The Engine is expected to load the model.
Environment
Include all relevant environment information:
OS: linux 5.10.0-30-cloud-amd64
Python version: 3.10.14
DeepSparse version: 1.7.1
ML framework version(s): torch==1.13.1, transformers==4.40.2
To Reproduce
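The issue does not include a reproduction script, so here is a minimal sketch of the two steps described: dynamically quantizing an exported RoBERTa ONNX model with ONNX Runtime, then loading it into the DeepSparse Engine. The file names are assumptions, not paths from the report.

```python
# Hypothetical repro sketch; file names below are assumptions, not from the issue.
from pathlib import Path

MODEL = Path("roberta-base.onnx")        # assumed FP32 ONNX export
QUANT = Path("roberta-base-quant.onnx")  # dynamically quantized output

try:
    from deepsparse import Engine
    from onnxruntime.quantization import QuantType, quantize_dynamic

    if MODEL.exists():
        # Weight-only dynamic quantization with ONNX Runtime.
        quantize_dynamic(str(MODEL), str(QUANT), weight_type=QuantType.QInt8)
        # Loading the result into the DeepSparse Engine is the step that
        # reportedly fails for the dynamically quantized model.
        engine = Engine(model=str(QUANT), batch_size=1)
        print(engine)
except ImportError:
    pass  # deepsparse / onnxruntime not installed in this environment
```

The guard around the imports and the file check only keep the sketch runnable in environments without the dependencies; the actual failure would occur at the `Engine(...)` call.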
Errors
Additional context
When using static quantization this error does not occur.
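For contrast, a sketch of the static (calibration-based) quantization path that reportedly loads fine. The input names and shapes are assumptions about the ONNX export; a real calibration reader should feed held-out data, not random tensors.

```python
# Contrast sketch: static quantization via ONNX Runtime. Input names/shapes
# ("input_ids", "attention_mask", seq len 128) are assumptions about the export.
from pathlib import Path

MODEL = Path("roberta-base.onnx")
STATIC_QUANT = Path("roberta-base-static-quant.onnx")

try:
    import numpy as np
    from onnxruntime.quantization import (CalibrationDataReader, QuantType,
                                          quantize_static)

    class RandomReader(CalibrationDataReader):
        """Feeds a few random batches; a real reader should use held-out text."""
        def __init__(self, n: int = 8):
            self._batches = iter(
                {"input_ids": np.random.randint(0, 50265, (1, 128), dtype=np.int64),
                 "attention_mask": np.ones((1, 128), dtype=np.int64)}
                for _ in range(n))

        def get_next(self):
            return next(self._batches, None)

    if MODEL.exists():
        quantize_static(str(MODEL), str(STATIC_QUANT),
                        calibration_data_reader=RandomReader(),
                        activation_type=QuantType.QUInt8,
                        weight_type=QuantType.QInt8)
except ImportError:
    pass  # onnxruntime not installed in this environment
```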