Updates to the Changelog for 6.2 (#3258)
causten authored and TedThemistokleous committed Aug 21, 2024
1 parent 28692b4 commit 3f8adcd
Showing 1 changed file with 76 additions and 0 deletions.
76 changes: 76 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,82 @@
Full documentation for MIGraphX is available at
[https://rocmdocs.amd.com/projects/AMDMIGraphX/en/latest/](https://rocmdocs.amd.com/projects/AMDMIGraphX/en/latest/).

## MIGraphX 2.10 for ROCm 6.2.0

### Additions

* Added support for ONNX Runtime MIGraphX EP on Windows
* Added FP8 Python API
* Added examples for SD 2.1 and SDXL
* Improved Dynamic Batch to support BERT
* Added a `--test` flag in migraphx-driver to validate the installation
* Added support for ONNX Operator: Einsum
* Added uint8 support in ONNX Operators
* Enabled Split-k kernel configurations for performance improvements
* Added fusion for group convolutions
* Added rocMLIR conv3d support
* Added rocgdb to the Dockerfile


### Optimizations

* Improved ONNX Model Zoo coverage
* Reorganized memcpys with ONNX Runtime to improve performance
* Replaced scalar multibroadcast + unsqueeze with just a multibroadcast
* Improved MLIR kernel selection for multibroadcasted GEMMs
* Improved details of the perf report
* Enabled MLIR by default for GEMMs with small K
* Allowed specifying dot or convolution fusion for MLIR with an environment variable
* Improved performance on small reductions by doing multiple reductions per wavefront
* Added algebraic simplifications for mul-add-dot sequences of operations involving constants
* Used MLIR attention kernels in more cases
* Enabled MIOpen and CK fusions for MI300 gfx architectures
* Added support for QDQ quantization patterns from Brevitas that have explicit cast/convert nodes before and after QDQ pairs
* Added fusion of "contiguous + pointwise" and "layout + pointwise" operations, which may result in performance gains in certain cases
* Added fusion of "pointwise + layout" and "pointwise + contiguous" operations, which may result in performance gains when using NHWC layout
* Added fusion of "pointwise + concat" operations, which may help performance in certain cases
* Fixed a bug in "concat + pointwise" fusion where the output shape's memory layout wasn't maintained
* Simplified the "slice + concat" pattern in SDXL UNet
* Eliminated ZeroPoint/Shift in QuantizeLinear or DequantizeLinear ops when the zero-point values are all zeros
* Improved inference performance by fusing Reduce to Broadcast
* Added additional information when printing the perf report
* Improved scalar fusions when not all strides are 0
* Added support for multi outputs in pointwise ops
* Improved reduction fusion with reshape operators
* Used the quantized output when an operator is used again
* Enabled Split-k GEMM perf configs for rocMLIR-based GEMM kernels for better performance on all hardware
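
The zero-point elimination listed above rests on a simple identity: ONNX-style dequantization computes `(q - zero_point) * scale`, so when the zero point is zero the subtraction can be dropped. The following is a minimal pure-Python sketch of that identity only. The function names are hypothetical and this is not MIGraphX code or its API.

```python
def dequantize_linear(q, scale, zero_point=0):
    """ONNX-style dequantization: y = (q - zero_point) * scale."""
    return [(v - zero_point) * scale for v in q]


def dequantize_linear_no_zp(q, scale):
    """Simplified form, valid only when the zero point is zero."""
    return [v * scale for v in q]


# Illustrative int8 values and scale (assumed, not taken from any model).
q = [-128, -1, 0, 1, 127]
scale = 0.05

# With a zero point of 0, both forms produce identical results,
# so the subtraction is redundant and can be removed.
assert dequantize_linear(q, scale, zero_point=0) == dequantize_linear_no_zp(q, scale)
```

With a nonzero zero point the two forms differ, which is why the simplification applies only when every zero-point value is zero.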


### Fixes

* Fixed Super Resolution model verification failure with FP16
* Suppressed confusing messages when compiling the model
* Fixed the Mod operator failing to compile with int8 and int32 inputs
* Prevented spawning too many threads for constant propagation when parallel STL is not enabled
* Fixed a bug when running migraphx-driver with the --run 1 option
* Fixed Layernorm accuracy by performing calculations in FP32
* Updated the Docker generator script to ROCm 6.1 to point at Jammy
* Fixed a floating-point exception for dim (-1) in the reshape operator
* Fixed issue with int8 accuracy and models which were failing due to requiring a fourth bias input
* Fixed missing inputs that were not previously handled for the quantized bias of the weights and for the data values of the input matrix
* Fixed the order of operations for int8 quantization, which was causing inaccuracies and slowdowns
* Removed the list initializer of prefix_scan_sum, which caused compilation issues and resulted in the incorrect constructor being used at compile time
* Fixed the MIGRAPHX_GPU_COMPILE_PARALLEL flag to enable users to control the number of threads used for parallel compilation



### Changes

* Changed the default location of libraries with release-specific ABI changes
* Reorganized documentation in GitHub


### Removals

* Removed the `--model` flag from migraphx-driver



## MIGraphX 2.9 for ROCm 6.1.0

### Additions