
QAT folding update #1639

Merged
merged 26 commits into main from feature/qat_folding_update on Jul 10, 2023

Conversation

@anmarques (Member) commented Jun 23, 2023

Incorporates these changes into the ONNXToDeepsparse QAT folding:

In addition, this PR fixes the following issues:

  • Relaxed the pattern matching for Convolution folding, which was leaving some convolutions unfolded
  • Allowed the MatMul weights to appear as either input 0 or input 1
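The MatMul fix above can be sketched conceptually. The snippet below is a minimal illustration, not sparseml's actual implementation: the function name, input names, and initializer set are all hypothetical. In ONNX, constant weights are stored as graph initializers, so folding logic that assumes the weights are always a MatMul's input 1 misses graphs where they appear as input 0:

```python
# Hypothetical sketch: find which input of a MatMul node holds the
# constant weights by checking the graph's initializer names.
# Before the fix, folding effectively assumed the answer was always 1.

def find_weight_input(node_inputs, initializer_names):
    """Return the index (0 or 1) of the MatMul input that is a graph
    initializer (i.e. the weights), or None if neither input is constant."""
    for idx, name in enumerate(node_inputs[:2]):
        if name in initializer_names:
            return idx
    return None

initializers = {"encoder.weight"}

# Weights as input 1 (the common layout)...
print(find_weight_input(["hidden_state", "encoder.weight"], initializers))  # 1
# ...and as input 0, which the relaxed matching now also handles.
print(find_weight_input(["encoder.weight", "hidden_state"], initializers))  # 0
```

A real implementation would operate on `onnx.GraphProto` nodes and would also need to handle the surrounding QuantizeLinear/DequantizeLinear pattern, but the weight-position check reduces to this lookup.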

Testing plan:

  • Exported the following models without QAT folding, then manually applied ONNXToDeepsparse to fold the graphs:
      • YOLOv8n base_quant
      • MobileBERT 14layer_pruned50_quant-none-vnni
      • YOLOv5s pruned50_quant
      • DistilBERT one-shot pruned quantized

@anmarques anmarques requested a review from bfineran June 23, 2023 22:13
@anmarques anmarques requested a review from natuan June 27, 2023 14:58
abhinavnmagic previously approved these changes Jul 1, 2023
natuan previously approved these changes Jul 1, 2023
@anmarques anmarques dismissed stale reviews from abhinavnmagic and natuan via 7629f5e July 5, 2023 17:59
@bfineran (Member) commented:

GHA failure unrelated - merging given extensive testing and previous reviews

@bfineran bfineran merged commit d0ba055 into main Jul 10, 2023
9 of 10 checks passed
@bfineran bfineran deleted the feature/qat_folding_update branch July 10, 2023 20:16