
Release 2.1.0

@ofirgo ofirgo released this 28 May 08:05
9d3593f

What's Changed

General changes:

  • Quantization enhancements:
    • Improved quantization parameters: Backpropagate the threshold of concatenation layers. This helps to minimize data loss during the quantization of these layer types.
    • Improved weights quantization parameters selection: Introduced Hessian-based MSE quantization error method.
      • To enable it, set weights_error_method to QuantizationErrorMethod.HMSE in the QuantizationConfig of CoreConfig.
      • Currently, this feature is only available in GPTQ due to the increased runtime required for Hessian computation.
    • Improved mixed precision: Use normalized MSE as the distance metric in mixed precision sensitivity evaluation for non-Hessian-based methods.
    • Improved mixed precision runtime: Added a validation step that determines whether quantizing the model to a requested target resource utilization requires mixed precision, or whether it can be achieved by quantizing the model to the maximal bit-width precision available.
    • Identity layers are now removed automatically to improve graph optimizations.
  • Introduced TPC IMX500.v2:
    • Enabled a new feature: metadata. The metadata is a dictionary, saved in the model file and in the model object, that contains information about the MCT environment (e.g. MCT version, framework version).
    • Quantize unfolded BatchNorm layers.
    • The default TPC remains IMX500.v1. To select IMX500.v2, use:
      • tpc_v2 = mct.get_target_platform_capabilities("tensorflow", "imx500", target_platform_version="v2")
      • mct.ptq.keras_post_training_quantization(model, representative_data_gen, target_platform_capabilities=tpc_v2)
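
The mixed precision validation step described above can be illustrated with a minimal sketch. This is not MCT's implementation: the function names, the per-layer parameter counts, and the choice of weights memory as the only resource are all assumptions made for illustration. The idea is that if the model already fits the target when every layer uses its maximal bit-width, no mixed precision search is needed.

```python
# Hypothetical sketch: none of these names come from the MCT API.

def weights_memory_bytes(layer_num_params, bit_widths):
    """Total weights memory in bytes when layer i uses bit_widths[i] bits per parameter."""
    total_bits = sum(n * b for n, b in zip(layer_num_params, bit_widths))
    return total_bits / 8


def requires_mixed_precision(layer_num_params, candidate_bit_widths, target_weights_bytes):
    """Return True only if the maximal bit-width configuration exceeds the target."""
    # Assign every layer the largest bit-width it supports.
    max_config = [max(candidates) for candidates in candidate_bit_widths]
    return weights_memory_bytes(layer_num_params, max_config) > target_weights_bytes


layer_num_params = [1000, 4000]                 # parameters per layer (made-up values)
candidate_bit_widths = [[2, 4, 8], [2, 4, 8]]   # bit-widths each layer supports

# At 8 bits everywhere the model needs (1000 + 4000) * 8 / 8 = 5000 bytes.
print(requires_mixed_precision(layer_num_params, candidate_bit_widths, 5000))  # False
print(requires_mixed_precision(layer_num_params, candidate_bit_widths, 4000))  # True
```

When the check returns False, the maximal bit-width configuration can be used directly, which is where the runtime saving would come from in this simplified picture.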

Tutorials

MCT tutorial notebooks updates:

  • Reorganized the tutorials into separate sections: IMX500 and MCT features.
  • Added new tutorials for IMX500: YOLOv8n object detection quantization in Keras and PyTorch, including an optional Gradient-Based PTQ step for optimized performance.
  • Removed the “quick-start” integration tool from MCT.

Breaking changes:

  • TF 2.11 is no longer supported.

Bug fixes:

  • Fixed a bug in the GPTQ parameters update.
  • Fixed a bug in the similarity analyzer when bias correction is used.
  • Fixed a bug in logging tf.image.combined_non_max_suppression to Tensorboard (#1055).