Use optimize_module pass for the quantization to fp16 #1974
Conversation
Force-pushed from 2ffa002 to e4ea8d6
Codecov Report

@@            Coverage Diff             @@
##           develop    #1974     +/-  ##
===========================================
+ Coverage    91.37%   91.43%   +0.05%
===========================================
  Files          420      421       +1
  Lines        15593    15590       -3
===========================================
+ Hits         14248    14254       +6
+ Misses        1345     1336       -9
Force-pushed from e4ea8d6 to 0da1e71
src/onnx/parse_batchnorm.cpp (outdated review comment)
If optimize_module is all that's needed, why do the parsers need to be modified?
The parser changes the order of computation. It calculates 1/sqrt(var + epsilon) * gamma first and then multiplies with x - mean(x). Previously it did (x - mean(x)) / sqrt(var + eps) first and then multiplied with gamma, which is not amenable to const-folding.
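To illustrate the reordering (a minimal NumPy sketch, not MIGraphX code; shapes and values are made up): both orderings are mathematically equivalent, but the new order groups the compile-time constants (gamma, var, epsilon) into one subexpression that a const-folding pass can evaluate ahead of time, leaving only one runtime multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3)).astype(np.float32)   # runtime input
gamma = rng.standard_normal(3).astype(np.float32)    # compile-time constants
mean = rng.standard_normal(3).astype(np.float32)
var = rng.random(3).astype(np.float32)
eps = np.float32(1e-5)

# Previous order: normalize the runtime tensor first, then scale by gamma
y_old = (x - mean) / np.sqrt(var + eps) * gamma

# New order: the constant subexpression is computed first (foldable),
# then applied to the runtime tensor in a single multiply
scale = 1.0 / np.sqrt(var + eps) * gamma             # constants only
y_new = (x - mean) * scale

assert np.allclose(y_old, y_new, rtol=1e-5)
```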
Check results before merge 🔆
❌ bert-mrpc-tf: ERROR - check error output
  Traceback (most recent call last):
    File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 281, in main()
    File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 226, in main
      import tensorflow as tf
  ModuleNotFoundError: No module named 'tensorflow'
🔴 torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output
🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output
❌ slim-vgg16_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
❌ slim-mobilenet_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
❌ slim-inceptionv4_1: ERROR - check error output (same ModuleNotFoundError: No module named 'tensorflow')
Fixes #1746

BatchNorm only has x as the runtime input parameter for the following equation; all the other parameters are compile-time constants, and the related operations can be const-folded before quantizing to fp16 to preserve precision:

y = gamma * (x - mean) / sqrt(var + epsilon) + beta

In particular, 1/sqrt(var + epsilon) * gamma can be const-folded. Adding the optimize_module() pass before quantize_fp16() performs the simplification and const-folding. This PR, along with #1969, produces FP16 ResNet50 accuracy the same as the FP32 model, ~76.45%.
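A rough sketch of why folding before quantizing preserves precision (plain NumPy with hypothetical per-channel constants; not the MIGraphX implementation): the scale is computed once at fp32 precision, only the final folded literal is rounded to fp16, and the runtime fp16 computation reduces to a subtract and a multiply.

```python
import numpy as np

# Hypothetical per-channel compile-time constants (fp32)
gamma = np.float32(1.2)
mean = np.float32(0.5)
var = np.float32(2.0)
eps = np.float32(1e-5)

# Const-fold at full precision, then quantize the single folded literal
scale16 = np.float16(gamma / np.sqrt(var + eps))
mean16 = np.float16(mean)

# Runtime fp16 computation is now just a subtract and a multiply
x = np.float16(0.75)
y16 = (x - mean16) * scale16

# Full-precision reference for comparison
y_ref = gamma * (np.float32(x) - mean) / np.sqrt(var + eps)
assert abs(np.float32(y16) - y_ref) < 1e-3
```

Quantizing before folding would instead round every constant (gamma, var, eps) to fp16 individually and accumulate rounding error through the sqrt and divide at runtime.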