
Releases: neuralmagic/deepsparse

DeepSparse v0.6.1 Patch Release

11 Aug 16:15
2b605ef

This is a patch release for 0.6.0 that contains the following changes:

Users no longer experience crashes:

  • when running the ReduceSum operation in the DeepSparse Engine.
  • when running operations on tensors that are 8- or 16-bit integers, or booleans, on AVX2.

DeepSparse v0.6.0

30 Jul 23:03
08a210c

New Features:

  • None

Changes:

  • Performance improvements made for:
    - all networks when running on multi-socket machines, especially those with large outputs.
    - batched Softmax and Reduce operators with many threads available.
    - Reshape operators when multiple dimensions are combined into one or one dimension is split into multiple.
    - stacked matrix multiplications by supporting more input layouts.
  • YOLOv3 example integration was generalized to ultralytics-yolo in support of both V3 and V5.

Resolved Issues:

  • Engine now runs on architectures with more than one NUMA node per socket.

Known Issues:

  • None

DeepSparse v0.5.1 Patch Release

30 Jun 16:23
8e0242b

This is a patch release for 0.5.0 that contains the following changes:

  • Resolved an issue that caused a performance regression on YOLOv5 and could have affected the correctness of some models.

DeepSparse v0.5.0

28 Jun 18:01
ec3c3f4

New Features:

  • None

Changes:

  • Performance optimizations implemented for binary elementwise operations where both inputs come from the same source buffer; one of the inputs may have intermediate unary operations.
  • Performance optimizations implemented for binary elementwise operations where one of the inputs is a constant scalar.
  • Small performance improvement for large batch sizes (> 64) on quantized ResNet.

Resolved Issues:

  • The deepsparse num_sockets assertion, which caused users to experience a crash when too many sockets were requested, has been removed.
  • Rare assertion failure fixed when a nonlinearity appeared between an elementwise addition and a convolution or gemm.
  • Broken URLs for classification and detection examples updated in the contained READMEs.

Known Issues:

  • None

DeepSparse v0.4.0

04 Jun 20:52
16c7915

New Features:

  • New operator support implemented for Expand.
  • Slice operator support for positive step sizes. Only slice operations that operate on a single axis are supported. Previously, slice was only supported for constant tensors and step size equal to one.

Changes:

  • Memory usage of compiled models reduced.
  • Memory layout for matrix multiplications in Transformers optimized.
  • Precision for swish and sigmoid operations improved.
  • Runtime performance improved for some networks whose outputs are immediately preceded by transpose operators.
  • Runtime performance of softmax operations improved.
  • Readme redesigned for better clarity on the repository's purpose.

Resolved Issues:

  • When using the multi-stream scheduler, selecting more threads than the number of cores on the system no longer causes a performance hit.
  • Neural Magic dependencies now upgrade to the intended bugfix (patch) versions instead of minor versions.

Known Issues:

  • None

DeepSparse v0.3.1 Patch Release

14 May 00:02
7ea8298

This is a patch release for 0.3.0 that contains the following changes:

  • Docs updated for new Discourse and Slack links
  • Check added for supported Python version so DeepSparse does not improperly install on unsupported systems

DeepSparse v0.3.0

30 Apr 23:54
54c7027

New Features:

  • Multi-stream scheduler added as a configurable option to the engine.
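
    As a usage sketch for the new option above: the snippet below assumes the scheduler is chosen through a scheduler argument on deepsparse.compile_model and that "multi_stream" is an accepted value; both are assumptions about the API and may differ in this release. model.onnx is a placeholder path.

    from deepsparse import compile_model

    # Placeholder ONNX model path; replace with a real model file.
    model_path = "model.onnx"

    # scheduler="multi_stream" is the assumed spelling of the new option;
    # omit it to keep the default single-stream behavior.
    engine = compile_model(model_path, batch_size=1, scheduler="multi_stream")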

Changes:

  • Errors related to setting the NUMA memory policy are now issued as warnings.
  • Improved compilation times for sparse networks.
  • Performance improvements made for: networks with large outputs and multi-socket machines; ResNet-50 v1 quantized and kernel sparsity gemms.
  • Copy operations and placement of quantization operations within network optimized.
  • Version changed to be loaded from the version.py file; the default build on branches is now nightly.
  • cpu.py file and related APIs added to DeepSparse repo instead of copying over from backend.
  • Unsupported-system install errors added for end users running on non-Linux systems.
  • YOLOv3 batch 64 quantized now has a speedup of 16% in the DeepSparse Engine.

Resolved Issues:

  • An assertion is no longer triggered when more sockets or threads than available are requested.
  • Resolved assertion when performing Concat operations on constant buffers.
  • Engine no longer crashes when the output of a QLinearMatMul operation has a dimension not divisible by 4.
  • The engine now starts without crashing on Windows Subsystem for Linux and Docker for Windows or Docker for Mac.

Known Issues:

  • None

DeepSparse v0.2.0

31 Mar 23:11
1852e90

New Features:

  • None

Changes:

  • Dense convolutions on AVX2 systems were optimized, improving performance for many non-pruned networks. In particular, this results in a speed improvement for batch size 64 ResNet-50 of up to 28% on Intel AVX2 systems and up to 39% on AMD AVX2 systems.
  • Operations to shuffle activations in the engine were optimized, resulting in up to a 14% speed improvement for batch size 64 pruned quantized MobileNetV1.
  • Performance improvements made for networks with large output arrays.

Resolved Issues:

  • Engine no longer fails with an assert when running some quantized networks.
  • Resize operators with an ROI input are now optimized; previously they were not.
  • Memory leak addressed on multi-socket systems when batch size > 1.
  • Docs and readme corrections made for minor issues and broken links.
  • Makefile no longer deletes files for docs compilation and cleaning.

Known Issues:

  • In rare cases where a tensor used as the input or output to an operation is larger than 2GB, the engine can segfault. Users should decrease the batch size as a workaround.

  • In some cases, models running complicated pre- or post-processing steps could diminish the DeepSparse Engine performance by up to 10x due to hyperthreading, as two engine threads can run on the same physical core. Address the performance issue by trying the following recommended solutions in order of preference:

    1. Enable thread binding

    If that does not give a performance benefit, or you want to try additional options:

    1. Use the numactl utility to prevent the process from running on hyperthreads.

    2. Manually set the thread affinity in Python as follows:

    import os
    from deepsparse.cpu import cpu_architecture

    # Query the CPU topology that DeepSparse detected for this machine.
    ARCH = cpu_architecture()

    # Pin the current process to one logical CPU per physical core so that
    # two engine threads never share a hyperthreaded core.
    if ARCH.vendor == "GenuineIntel":
        # Intel machines typically enumerate one logical CPU per physical core first.
        os.sched_setaffinity(0, range(ARCH.num_physical_cores()))
    elif ARCH.vendor == "AuthenticAMD":
        # AMD machines typically place hyperthread siblings on adjacent IDs, so take every other one.
        os.sched_setaffinity(0, range(0, 2 * ARCH.num_physical_cores(), 2))
    else:
        raise RuntimeError(f"Unknown CPU vendor {ARCH.vendor}")


DeepSparse v0.1.1 Patch Release

01 Mar 19:56
4940121

This is a patch release for 0.1.0 that contains the following changes:

  • Docs updates: tagline, overview, and verbiage updated to use sparsification terminology
  • Examples updated to use new ResNet-50 pruned_quant moderate model from the SparseZoo
  • Nightly build dependencies now match on major.minor and not full version
  • Benchmarking script added for reproducing ResNet-50 numbers
  • Small (3-5%) performance improvement for pruned quantized ResNet-50 models, for batch size greater than 16
  • Reduced memory footprint for networks with sparse fully connected layers
  • Improved performance on multi-socket systems when batch size is larger than 1

DeepSparse v0.1.0 First GitHub Release

04 Feb 21:21
6036701

Welcome to our initial release on GitHub! Older release notes can be found here.

New Features:

  • Operator support enabled:
    • QLinearAdd
    • 2D QLinearMatMul when the second operand is constant
  • Multi-stream support added for concurrent requests (see the usage sketch after this list).
  • Examples for benchmarking, classification flows, detection flows, and Flask servers added.
  • Jupyter Notebooks for classification and detection flows added.
  • Makefile flows and utilities implemented for GitHub repo structure.
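
    As a usage sketch for the concurrent-request support noted above: a single compiled engine is shared by several worker threads, each submitting its own request. This is an illustrative pattern rather than an official example; the model path and input shape are placeholders.

    from concurrent.futures import ThreadPoolExecutor

    import numpy as np
    from deepsparse import compile_model

    # Placeholder model path and input shape; replace with your own.
    engine = compile_model("model.onnx", batch_size=1)
    sample = [np.random.rand(1, 3, 224, 224).astype(np.float32)]

    # Submit several requests concurrently against the same engine.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(engine.run, [sample] * 4))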

Changes:

  • Software packaging updated to reflect new GitHub distribution channel, from file naming conventions to license enforcement removal.
  • Initial startup message updated with improved language.
  • Distribution now manylinux2014 compliant; support for Ubuntu 16.04 deprecated.
  • QuantizeLinear operations now use division instead of scaling by the reciprocal for small quantization scales (see the sketch after this list).
  • Small performance improvements made on some quantized networks with nontrivial activation zero points.
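
    As a small numerical sketch of why the division change matters (illustrative only, not DeepSparse internals): dividing by the scale is rounded once, while multiplying by a precomputed reciprocal rounds the reciprocal first, so the two paths can disagree near rounding boundaries.

    import numpy as np

    scale = np.float32(3.0e-5)        # a small scale that is not exactly representable
    recip = np.float32(1.0) / scale   # reciprocal, itself rounded to float32

    x = np.linspace(0, 1, 100_001, dtype=np.float32)
    q_div = np.rint(x / scale)        # quantize by dividing by the scale
    q_mul = np.rint(x * recip)        # quantize by multiplying by the reciprocal

    print("inputs quantized differently:", int(np.count_nonzero(q_div != q_mul)))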

Resolved Issues:

  • Networks with sparse quantized convolutions and nontrivial activation zero points now have consistent correct results.
  • Crash no longer occurs for some models where a quantized depthwise convolution follows a non-depthwise quantized convolution.

Known Issues:

  • None