Use optimize_module pass for the quantization to fp16 #1974
Conversation
Force-pushed from 2ffa002 to e4ea8d6
Codecov Report

@@            Coverage Diff             @@
##           develop    #1974     +/-  ##
===========================================
+ Coverage    91.37%   91.43%   +0.05%
===========================================
  Files          420      421       +1
  Lines        15593    15590       -3
===========================================
+ Hits         14248    14254       +6
+ Misses        1345     1336       -9
Force-pushed from e4ea8d6 to 0da1e71
src/onnx/parse_batchnorm.cpp (outdated review comment)
If optimize_module is all that's needed, why do the parsers need to be modified?
The parser changes the order of computation. It calculates 1/sqrt(var + epsilon) * gamma first and then multiplies with x - mean(x). Previously it did (x - mean(x)) / sqrt(var + eps) first and then multiplied with gamma, which is not amenable to const-folding.
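To illustrate the reordering (a minimal NumPy sketch, not MIGraphX code; shapes and values are made up): both orderings are mathematically equivalent, but the new order groups the compile-time constants (gamma, var, epsilon) into one subexpression that a const-folding pass can evaluate ahead of time, leaving only one runtime multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3)).astype(np.float32)   # runtime input
gamma = rng.standard_normal(3).astype(np.float32)    # compile-time constants
mean = rng.standard_normal(3).astype(np.float32)
var = rng.random(3).astype(np.float32)
eps = np.float32(1e-5)

# Previous order: normalize the runtime tensor first, then scale by gamma
y_old = (x - mean) / np.sqrt(var + eps) * gamma

# New order: the constant subexpression is computed first (foldable),
# then applied to the runtime tensor in a single multiply
scale = 1.0 / np.sqrt(var + eps) * gamma             # constants only
y_new = (x - mean) * scale

assert np.allclose(y_old, y_new, rtol=1e-5)
```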
Check results before merge 🔆
❌ bert-mrpc-tf: ERROR - check error output
  Traceback (most recent call last):
    File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in main()
    File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
      import tensorflow as tf
  ModuleNotFoundError: No module named 'tensorflow'
🔴 torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output
❌ slim-vgg16_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
❌ slim-mobilenet_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
❌ slim-inceptionv4_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
Fixes #1746

BatchNorm only has x as the runtime input parameter for the following equation; all the other parameters are compile-time constants, and the related operations can be const-folded before quantizing to fp16 to preserve precision:

y = gamma * (x - mean) / sqrt(var + epsilon) + beta

In particular, 1/sqrt(var + epsilon) * gamma can be const-folded. Adding the optimize_module() pass before quantize_fp16() performs the simplification and const-folding. This PR, along with #1969, produces FP16 ResNet50 accuracy the same as the FP32 model, ~76.45%.
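A rough sketch of why folding before quantizing preserves precision (plain NumPy with hypothetical per-channel constants; not the MIGraphX implementation): the scale is computed once at fp32 precision, only the final folded literal is rounded to fp16, and the runtime fp16 computation reduces to a subtract and a multiply.

```python
import numpy as np

# Hypothetical per-channel compile-time constants (fp32)
gamma = np.float32(1.2)
mean = np.float32(0.5)
var = np.float32(2.0)
eps = np.float32(1e-5)

# Const-fold at full precision, then quantize the single folded literal
scale16 = np.float16(gamma / np.sqrt(var + eps))
mean16 = np.float16(mean)

# Runtime fp16 computation is now just a subtract and a multiply
x = np.float16(0.75)
y16 = (x - mean16) * scale16

# Full-precision reference for comparison
y_ref = gamma * (np.float32(x) - mean) / np.sqrt(var + eps)
assert abs(np.float32(y16) - y_ref) < 1e-3
```

Quantizing before folding would instead round every constant (gamma, var, eps) to fp16 individually and accumulate rounding error through the sqrt and divide at runtime.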