DeepSparse v0.10.0

@jeanniefinks jeanniefinks released this 03 Feb 16:40
b27fbda

New Features:

  • Quantization support enabled on AVX2 instruction set for GEMM and elementwise operations.
  • NM_SPOOF_ARCH environment variable added for testing different architectural configurations.
  • Elastic scheduler implemented as an alternative to the single-stream and multi-stream schedulers.
  • deepsparse.benchmark application added to simplify benchmarking; it is available from the command line after installing deepsparse.
  • deepsparse.server CLI and API added with transformers support to make serving models like BERT with pipelines easy.
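
As a rough illustration of how the new benchmark command and the NM_SPOOF_ARCH variable might be combined in a test script, the Python sketch below builds the command line and environment for a run. The helper names (build_benchmark_command, build_env, run_benchmark) are hypothetical, the --batch_size flag is assumed from the benchmark tool's help output, and the values NM_SPOOF_ARCH accepts are not listed in these notes, so the caller must supply one from the DeepSparse documentation.

```python
import os
import subprocess


def build_benchmark_command(model_path, batch_size=1):
    """Return the argv list for a deepsparse.benchmark run (hypothetical helper)."""
    # The model path is passed positionally; --batch_size is an assumed flag name.
    return ["deepsparse.benchmark", model_path, "--batch_size", str(batch_size)]


def build_env(spoof_arch=None):
    """Copy the current environment, optionally setting NM_SPOOF_ARCH."""
    env = dict(os.environ)
    if spoof_arch is not None:
        # Accepted values are not given in the release notes; pass one from the docs.
        env["NM_SPOOF_ARCH"] = spoof_arch
    return env


def run_benchmark(model_path, batch_size=1, spoof_arch=None):
    """Launch the benchmark as a subprocess; requires deepsparse to be installed."""
    cmd = build_benchmark_command(model_path, batch_size)
    return subprocess.run(cmd, env=build_env(spoof_arch), check=True)
```

Calling run_benchmark("model.onnx", batch_size=64) would then start one benchmark run. Keeping command construction separate from execution lets the arguments be inspected without deepsparse installed.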

Changes:

  • More robust architecture detection added to help resolve CPU topology, such as when running inside a virtual machine.
  • Tensor columns improved, leading to speedups of 5 to 20% in pruned YOLO (larger batch sizes), BERT (smaller batch sizes), MobileNet, and ResNet models.
  • Sparse quantized network performance improved on machines that do not support VNNI instructions.
  • Performance improved for dense BERT with large batch sizes.

Resolved Issues:

  • Possible crashes eliminated for:
    • Pooling operations with small image sizes
    • Networks containing convolution or GEMM operations (rare)
    • Some models with many residual connections

Known Issues:

  • None