
Use optimize_module pass for the quantization to fp16 #1974

Merged
9 commits merged into develop from batchnorm_fp16 on Jul 21, 2023
Conversation

umangyadav (Member) commented Jul 18, 2023

Fixes #1746

BatchNorm only has x as the runtime input parameter for the following equation. All the other parameters are compile-time constants and related operations can be const-folded before quantizing to fp16 to preserve precision.

1/sqrt(var + epsilon) * gamma can be const-folded.

Adding the optimize_module() pass before quantize_fp16() performs this simplification and const-folding.

[attached image]
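For reference, here is a sketch of the arithmetic being folded, using the standard inference-time BatchNorm formula (gamma, beta, mu, sigma^2, and epsilon are compile-time constants; x is the only runtime input):

```latex
y = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta
  = \underbrace{\frac{\gamma}{\sqrt{\sigma^2 + \epsilon}}}_{a\ (\text{constant})} \cdot (x - \mu) + \beta
  = a \cdot x + \underbrace{\left(\beta - a\,\mu\right)}_{b\ (\text{constant})}
```

Once a and b are folded in FP32, the only arithmetic left to run in FP16 after quantization is a multiply-add on x, so 1/sqrt(var + epsilon) never has to be evaluated in half precision.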

This PR, together with #1969, brings FP16 ResNet50 accuracy to the same level as the FP32 model (~76.45%).

umangyadav self-assigned this on Jul 18, 2023
umangyadav added the 'high priority' label (A PR with high priority for review and merging.) on Jul 18, 2023
codecov bot commented Jul 18, 2023

Codecov Report

Merging #1974 (f2d5ef5) into develop (04ae9b8) will increase coverage by 0.05%.
The diff coverage is 100.00%.

❗ Current head f2d5ef5 differs from pull request most recent head 4147411. Consider uploading reports for the commit 4147411 to get more accurate results

@@             Coverage Diff             @@
##           develop    #1974      +/-   ##
===========================================
+ Coverage    91.37%   91.43%   +0.05%     
===========================================
  Files          420      421       +1     
  Lines        15593    15590       -3     
===========================================
+ Hits         14248    14254       +6     
+ Misses        1345     1336       -9     
Impacted Files Coverage Δ
src/onnx/parse_batchnorm.cpp 93.75% <100.00%> (-0.37%) ⬇️
src/quantization.cpp 86.04% <100.00%> (-0.32%) ⬇️
src/tf/parse_batchnorm.cpp 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

Collaborator:

If optimize_module is all that's needed, why do the parsers need to be modified?

umangyadav (Member, Author) replied on Jul 19, 2023:

The parser changes the order of computation.

It now calculates 1/sqrt(var + epsilon) * gamma first and then multiplies that with (x - mean(x)).

Previously it computed (x - mean(x)) / sqrt(var + eps) first and then multiplied with gamma, which is not amenable to const-folding.
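A minimal standalone sketch (illustrative only; this is not the MIGraphX parser code, and the scalar constants are hypothetical stand-ins for the per-channel tensors) of why the reordering matters once intermediates are cast to fp16: in the new order the scale gamma / sqrt(var + eps) depends only on compile-time constants and can be folded once in FP32, while in the old order the division by sqrt(var + eps) is applied to runtime values and therefore ends up running in half precision after quantization.

```cpp
// Illustrative sketch only: scalar stand-ins for BatchNorm's per-channel constants.
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    // All of these are known at compile (parse) time.
    const float gamma = 0.75f, beta = 0.1f, mean = 4.2f, var = 1e-3f, eps = 1e-5f;

    // New order: gamma / sqrt(var + eps) uses only constants, so a const-folding
    // pass can replace it with a single FP32 literal before any fp16 cast.
    const float scale = gamma / std::sqrt(var + eps);

    const std::vector<float> x = {4.1f, 4.2f, 4.3f}; // runtime input
    for(const float xi : x)
    {
        // New order: one subtract and one multiply-add on runtime data.
        const float y_new = (xi - mean) * scale + beta;

        // Old order: the division by sqrt(var + eps) acts on runtime values,
        // so after fp16 quantization it runs on half-precision intermediates,
        // which is where the precision loss came from.
        const float y_old = ((xi - mean) / std::sqrt(var + eps)) * gamma + beta;

        std::printf("new = %f, old = %f\n", y_new, y_old);
    }
    return 0;
}
```

In FP32 the two orderings give mathematically identical results; the difference only shows up once the intermediates are cast to FP16.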

migraphx-bot commented Jul 19, 2023

Test    Batch    Rate new (4147411)    Rate old (04ae9b8)    Diff    Compare
torchvision-resnet50 64 2,276.34 2,276.69 -0.02%
torchvision-resnet50_fp16 64 5,329.78 5,330.98 -0.02%
torchvision-densenet121 32 1,819.12 1,824.31 -0.28%
torchvision-densenet121_fp16 32 3,381.44 3,367.05 0.43%
torchvision-inceptionv3 32 1,340.81 1,344.65 -0.29%
torchvision-inceptionv3_fp16 32 2,534.87 2,531.51 0.13%
cadene-inceptionv4 16 677.23 677.03 0.03%
cadene-resnext64x4 16 588.46 588.32 0.03%
slim-mobilenet 64 7,212.11 7,217.40 -0.07%
slim-nasnetalarge 64 236.30 236.01 0.12%
slim-resnet50v2 64 2,519.37 2,455.52 2.60%
bert-mrpc-onnx 8 718.86 718.69 0.02%
bert-mrpc-tf 1 363.49 363.03 0.13%
pytorch-examples-wlang-gru 1 310.36 309.84 0.17%
pytorch-examples-wlang-lstm 1 316.37 317.18 -0.26%
torchvision-resnet50_1 1 554.15 555.26 -0.20%
torchvision-inceptionv3_1 1 305.86 307.06 -0.39%
cadene-dpn92_1 1 358.84 346.25 3.64% 🔆
cadene-resnext101_1 1 219.86 220.01 -0.07%
slim-vgg16_1 1 223.61 223.84 -0.10%
slim-mobilenet_1 1 1,475.46 1,482.99 -0.51%
slim-inceptionv4_1 1 222.29 224.66 -1.05%
onnx-taau-downsample 1 320.88 321.02 -0.04%
dlrm-criteoterabyte 1 21.65 21.65 -0.01%
dlrm-criteoterabyte_fp16 1 40.59 40.57 0.04%
agentmodel 1 5,875.06 5,943.57 -1.15%
unet_fp16 2 55.00 55.09 -0.15%

Check results before merge 🔆

migraphx-bot commented Jul 19, 2023


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌ bert-mrpc-tf: ERROR - check error output
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

🔴 torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output

🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

❌ slim-vgg16_1: ERROR - check error output
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

❌ slim-mobilenet_1: ERROR - check error output
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

❌ slim-inceptionv4_1: ERROR - check error output
Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

src/pass_manager.cpp: review comment (outdated, resolved)
causten merged commit 6f1f4b5 into develop on Jul 21, 2023
11 checks passed
causten deleted the batchnorm_fp16 branch on July 21, 2023, 13:51
kahmed10 pushed a commit that referenced this pull request Jul 27, 2023
Fixes #1746

BatchNorm only has x as the runtime input parameter for the following equation. All the other parameters are compile-time constants and related operations can be const-folded before quantizing to fp16 to preserve precision.
Labels: high priority (A PR with high priority for review and merging.)
Projects: None yet
Development: Successfully merging this pull request may close these issues: ResNet50 accuracy
6 participants