Release 2.1.0 · sony/model_optimization

What's Changed

General changes:

Quantization enhancements:
- Improved quantization parameters: Backpropagate the threshold of concatenation layers. This helps to minimize data loss during the quantization of these layer types.
- Improved weights quantization parameters selection: Introduced Hessian-based MSE quantization error method.
  - Set weights_error_method to QuantizationErrorMethod.HMSE in QuantizationConfig in CoreConfig
  - Currently, this feature is only available in GPTQ due to the increased runtime required for Hessian computation.
- Improved mixed precision: Use normalized MSE as the distance metric in mixed precision sensitivity evaluation for non-Hessian-based methods.
- Improved mixed precision runtime: Added a validation step that checks whether reaching the requested target resource utilization actually requires mixed precision, or whether it can be achieved by quantizing the whole model at the maximal available bit-width.
- Identity layers are now removed automatically to improve graph optimizations.
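
The two mixed precision items above can be illustrated with a minimal, library-independent sketch. It is not MCT's implementation: the function names, the choice to normalize the MSE by the mean squared float output, and the weights-memory formula (parameter count × bit-width / 8) are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def normalized_mse(float_out: Sequence[float],
                   quant_out: Sequence[float],
                   eps: float = 1e-12) -> float:
    """MSE between float and quantized outputs, divided by the mean
    squared float output (assumed normalization, for illustration)."""
    if len(float_out) != len(quant_out):
        raise ValueError("outputs must have the same length")
    n = len(float_out)
    mse = sum((f - q) ** 2 for f, q in zip(float_out, quant_out)) / n
    energy = sum(f ** 2 for f in float_out) / n
    # eps guards against division by zero for an all-zero float output.
    return mse / max(energy, eps)


def requires_mixed_precision(weights_per_layer: Dict[str, int],
                             max_bits: int,
                             target_weights_bytes: float) -> bool:
    """Return True only if quantizing every layer at the maximal
    bit-width exceeds the target weights memory (hypothetical check)."""
    max_config_bytes = sum(weights_per_layer.values()) * max_bits / 8
    return max_config_bytes > target_weights_bytes


# Same 10% relative error at two very different output scales.
small_scale = normalized_mse([1.0, 2.0], [1.1, 2.2])
large_scale = normalized_mse([100.0, 200.0], [110.0, 220.0])
print(small_scale, large_scale)

# Hypothetical two-layer model with 4000 weights in total.
layers = {"conv1": 1000, "dense": 3000}
print(requires_mixed_precision(layers, max_bits=8, target_weights_bytes=4000))
print(requires_mixed_precision(layers, max_bits=8, target_weights_bytes=3000))
```

Because the normalized metric is a ratio, layers with the same relative error get the same score regardless of output magnitude, which keeps sensitivity scores comparable across layers. The second function reflects the validation idea: if the model already fits the target at the maximal bit-width, a mixed precision search is unnecessary.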

Introduced TPC IMX500.v2:
- Enabled a new feature: metadata. The metadata is a dictionary, saved in both the model file and the model object, that contains information about the MCT environment (e.g. MCT version, framework version).
- Quantize unfolded BatchNorm layers.
- The default TPC remains IMX500.v1. To select IMX500.v2, use:
  - tpc_v2 = mct.get_target_platform_capabilities("tensorflow", 'imx500', target_platform_version="v2")
  - mct.ptq.keras_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v2)

Tutorials

MCT tutorial notebook updates:

- Reorganized the tutorials into separate sections: IMX500 and MCT features.
- Added new tutorials for IMX500: YOLOv8n object detection quantization in Keras and PyTorch, including an optional Gradient-Based PTQ step for optimized performance.
- Removed the “quick-start” integration tool from MCT.

Breaking changes:

TF 2.11 is no longer supported.

Bug fixes:

- Fixed a bug in the GPTQ parameters update.
- Fixed a bug in the similarity analyzer when bias correction is used.
- Fixed a bug in logging tf.image.combined_non_max_suppression to Tensorboard (#1055).
