[cherry-pick] Remove gpu_cpu_reshape2_matmul_fuse_pass in EnableMkldnn #43750

lidanqing-intel · 2022-06-22T05:59:28Z

PR types

Others

PR changes

Others

Describe

Fix the ResNet50 performance drop issue.

paddle-bot-old · 2022-06-22T05:59:43Z

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the results of the automated CI tests first. See the Paddle CI Manual for details.

lidanqing-intel · 2022-06-22T08:20:37Z

@wozna @yeliang2258 Please review this cherry-pick PR.

lidanqing-intel · 2022-06-22T08:29:29Z

With this PR, the performance results are as follows. @yeliang2258 @jiangjiajun

Resnet50 FP32

KMP_BLOCKTIME=1 KMP_AFFINITY=granularity=fine,compact,1,0 GLOG_logtostderr=1 numactl -c 0 -m 0 ./build/sample_tester --infer_model=resnet50_fp32/model/ --infer_data=val_1000.bin --batch_size=1 --num_threads=1 --iterations=0 --with_accuracy_layer=true --use_analysis=true

I0622 08:32:15.652083 411584 sample_tester.cc:162] Model: resnet50_fp32/model/
I0622 08:32:15.652092 411584 sample_tester.cc:163] ====== num of threads: 1 ======
I0622 08:32:15.652096 411584 sample_tester.cc:164] ====== batch size: 1, iterations: 1000
I0622 08:32:15.652096 411584 sample_tester.cc:165] ====== batch latency: 50.0012ms, number of samples: 1000
I0622 08:32:15.652109 411584 sample_tester.cc:167] , sample latency: 50.0012ms, fps: 19.9995 ======
I0622 08:32:15.652285 411584 sample_tester.cc:309] Top1 accuracy: 0.7670
I0622 08:32:15.652299 411584 sample_tester.cc:311] Top5 accuracy: 0.9370

Resnet50 INT8

KMP_BLOCKTIME=1 KMP_AFFINITY=granularity=fine,compact,1,0 GLOG_logtostderr=1 ./build/sample_tester --infer_model=INT8 --infer_data=val_1000.bin --batch_size=1 --num_threads=1 --iterations=0 --with_accuracy_layer=false --use_analysis=true

I0622 08:33:34.907552 411585 sample_tester.cc:162] Model: INT8
I0622 08:33:34.907559 411585 sample_tester.cc:163] ====== num of threads: 1 ======
I0622 08:33:34.907562 411585 sample_tester.cc:164] ====== batch size: 1, iterations: 1000
I0622 08:33:34.907563 411585 sample_tester.cc:165] ====== batch latency: 13.4044ms, number of samples: 1000
I0622 08:33:34.907577 411585 sample_tester.cc:167] , sample latency: 13.4044ms, fps: 74.6022 ======
I0622 08:33:34.960709 411585 sample_tester.cc:309] Top1 accuracy: 0.7550
I0622 08:33:34.960731 411585 sample_tester.cc:311] Top5 accuracy: 0.9420
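
As a rough cross-check of the two logs above, the reported fps values can be recomputed from the sample latencies, and the INT8 vs. FP32 ratio derived from them. This is only a sketch: the variable and function names are illustrative, the numbers are copied by hand from the logs, and the two runs were not launched identically (only the FP32 run uses numactl), so the ratio is indicative rather than a controlled comparison.

```python
# Values copied from the sample_tester logs above (batch size 1, 1000 samples).
fp32_latency_ms = 50.0012
int8_latency_ms = 13.4044
fp32_top1, int8_top1 = 0.7670, 0.7550


def fps_from_latency(latency_ms: float) -> float:
    """Samples per second implied by a per-sample latency in milliseconds."""
    return 1000.0 / latency_ms


fp32_fps = fps_from_latency(fp32_latency_ms)
int8_fps = fps_from_latency(int8_latency_ms)

# Ratio of the two latencies; not a controlled comparison (see note above).
speedup = fp32_latency_ms / int8_latency_ms

# Absolute Top1 accuracy difference between the FP32 and INT8 runs.
top1_drop = fp32_top1 - int8_top1

print(f"FP32 fps: {fp32_fps:.4f}")
print(f"INT8 fps: {int8_fps:.4f}")
print(f"INT8 vs FP32 latency ratio: {speedup:.2f}x")
print(f"Top1 accuracy difference: {top1_drop:.4f}")
```

The recomputed fps may differ from the logged value in the last digit because the logged latency is itself rounded.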

wozna

I can confirm that this PR should not affect the performance of any other model running with EnableMkldnn; it only affects this ResNet50 INT8 model.
LGTM

jiangjiajun

LGTM

ZeyuChen

LGTM

paddle-bot-old bot added contributor External developers status: proposed labels Jun 22, 2022

lidanqing-intel changed the title ~~Remove gpu_cpu_reshape2_matmul_fuse_pass in EnableMkldnn~~ [cherry-pick] Remove gpu_cpu_reshape2_matmul_fuse_pass in EnableMkldnn Jun 22, 2022

lidanqing-intel added the Intel label Jun 22, 2022

paddle-bot-old bot removed the status: proposed label Jun 22, 2022

wozna previously approved these changes Jun 22, 2022

lidanqing-intel dismissed wozna’s stale review via 571cd5f June 22, 2022 14:40

lidanqing-intel force-pushed the release/2.3-fix-resnet50-perf branch from 30f5496 to 571cd5f Compare June 22, 2022 14:40

lidanqing-intel mentioned this pull request Jun 22, 2022

[cherry-pick] release/2.3 elementwise_mul and matmul mkldnn fix #43725

Merged

remove slowing down pass

dcde6eb

jiangjiajun previously approved these changes Jun 23, 2022

lidanqing-intel dismissed jiangjiajun’s stale review via b6e5d25 June 23, 2022 06:01

lidanqing-intel force-pushed the release/2.3-fix-resnet50-perf branch 2 times, most recently from b6e5d25 to dcde6eb Compare June 23, 2022 06:08

ZeyuChen approved these changes Jun 23, 2022

ZeyuChen merged commit 096eb80 into PaddlePaddle:release/2.3 Jun 23, 2022

lidanqing-intel deleted the release/2.3-fix-resnet50-perf branch July 15, 2022 10:06

paddle-bot-old bot removed the contributor External developers label Oct 17, 2022
