Getting started: using the new features of MIGraphX 0.4

New Features in MIGraphX 0.4

MIGraphX 0.4 supports the following new features:

  • Quantization support for fp16 and int8
  • Support for NLP models, particularly BERT, with both TensorFlow and ONNX examples

This page provides examples and pointers on how to use these new features.

Quantization

Release 0.4 adds support for int8 quantization, in addition to the fp16 support introduced in release 0.3. One way int8 quantization differs from fp16 is that MIGraphX needs to determine "scale factors" to convert between fp32 and int8 values. There are two methods for determining these scale factors:

  • MIGraphX can use built-in heuristics to pick the scale factors, or
  • MIGraphX int8 quantization functions can accept a set of "calibration data" as input. The model is run with this calibration data, and scale factors are determined by measuring intermediate inputs. The format of the calibration data is the same as the data later used for evaluation (see the sketch after this list).
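Because calibration data takes the same form as evaluation input, it can be built as a parameter map from input names to argument values. The following is a minimal C++ sketch of assembling one calibration batch; it assumes the 0.4-era C++ API (program::parameter_map, get_parameter_shapes, and generate_argument from migraphx/generate.hpp), and the helper name make_calibration_batch is hypothetical:

```cpp
#include <migraphx/program.hpp>
#include <migraphx/generate.hpp>

// Build one calibration batch in the same parameter_map format used for
// evaluation: a map from input parameter names to argument values.
// NOTE: real calibration should use representative input data;
// generate_argument is only a placeholder here.
migraphx::program::parameter_map make_calibration_batch(migraphx::program& prog)
{
    migraphx::program::parameter_map batch;
    for(auto&& param : prog.get_parameter_shapes())
        batch[param.first] = migraphx::generate_argument(param.second);
    return batch;
}
```

Several such batches can be collected into a std::vector and passed to the int8 quantization call, as sketched in the next section.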

The APIs MIGraphX provides for quantization have been updated to the following:

...to be added...
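As a rough illustration in the meantime, the sketch below shows how the quantization entry points can be used end to end. The names and signatures (quantize_fp16, quantize_int8, and their handling of an empty calibration set) are assumptions based on the migraphx/quantization.hpp header from around this release and should be verified against the installed headers:

```cpp
#include <vector>
#include <migraphx/onnx.hpp>
#include <migraphx/quantization.hpp>
#include <migraphx/gpu/target.hpp>

int main()
{
    // Parse an ONNX model into a MIGraphX program.
    migraphx::program prog = migraphx::parse_onnx("model.onnx");
    migraphx::gpu::target t;

    // fp16 quantization (available since release 0.3):
    // migraphx::quantize_fp16(prog);

    // int8 quantization: an empty calibration set falls back to the
    // built-in heuristics; populate it with parameter_map batches (as in
    // the earlier sketch) to derive scale factors from measured data.
    std::vector<migraphx::program::parameter_map> calibration;
    migraphx::quantize_int8(prog, t, calibration);

    // Compile for the GPU target; the program is then ready for eval().
    prog.compile(t);
}
```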

BERT, a natural language processing (NLP) model

Release 0.4 includes improvements that enable MIGraphX to optimize the BERT NLP model. Cookbook examples are included for both ONNX and TensorFlow frozen graphs. These examples are based on the following repositories: