[PFCC Operator Performance Optimization] Optimize adaptive_pooling_op performance for Paddle #45959

Merged (3 commits) on Sep 20, 2022

@OuyangChao OuyangChao commented Sep 12, 2022

PR types

Performance optimization

PR changes

OPs

Describe

  • document: PFCC PR 244
  • device: GeForce RTX 3080 (Compute Capability: 8.6)
  • data_type: float32
  • data_format: NCHW
  • diff: (before - after) / before
| Case No. | input_shape | output_shape | pooling_type | kernel_time before (ms) | kernel_time after (ms) | diff |
|---|---|---|---|---|---|---|
| 0 | [128,64,112,112] | [56,56] | AVG | 5.7967 | 0.7395 | 87.24% |
| 1 | [128,512,7,7] | [1,1] | AVG | 0.0341 | 0.0341 | 0.00% |
| 2 | [128,2048,7,7] | [1,1] | AVG | 0.1239 | 0.1242 | -0.24% |
| 3 | [4,2048,64,128] | [32,32] | AVG | 2.1829 | 0.4311 | 80.25% |
| 4 | [128,64,224,224] | [112,112] | AVG | 19.1376 | 3.2681 | 82.92% |
| 5 | [128,128,112,112] | [56,56] | AVG | 11.0257 | 1.4704 | 86.66% |
| 6 | [128,256,56,56] | [28,28] | AVG | 6.7391 | 0.7834 | 88.38% |
| 7 | [128,512,28,28] | [14,14] | AVG | 3.5089 | 0.4116 | 88.27% |
| 8 | [128,512,14,14] | [7,7] | AVG | 0.9216 | 0.1138 | 87.65% |
| 9 | [128,64,112,112] | [56,56] | MAX | 5.9602 | 0.9065 | 84.79% |
| 10 | [128,512,7,7] | [1,1] | MAX | 0.0895 | 0.0739 | 17.43% |
| 11 | [128,2048,7,7] | [1,1] | MAX | 0.3027 | 0.2711 | 10.44% |
| 12 | [4,2048,64,128] | [32,32] | MAX | 2.2829 | 0.4811 | 78.93% |
| 13 | [128,64,224,224] | [112,112] | MAX | 21.6426 | 4.2093 | 80.55% |
| 14 | [128,128,112,112] | [56,56] | MAX | 11.8736 | 1.8121 | 84.74% |
| 15 | [128,256,56,56] | [28,28] | MAX | 6.8098 | 0.891 | 86.92% |
| 16 | [128,512,28,28] | [14,14] | MAX | 3.5734 | 0.4537 | 87.30% |
| 17 | [128,512,14,14] | [7,7] | MAX | 0.9262 | 0.1214 | 86.89% |
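
For readers who want to re-derive the diff column, here is a minimal sketch of the metric as defined in the description, `(before - after) / before`. The function names `RelativeReduction` and `Speedup` are hypothetical and are not part of the PR; the speedup factor is not reported in the table and is included only as an alternative reading of the same numbers.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction in kernel time, following "diff: (before - after) / before".
// Hypothetical helper, not part of the PR.
inline double RelativeReduction(double before_ms, double after_ms) {
  assert(before_ms > 0.0);
  return (before_ms - after_ms) / before_ms;
}

// Equivalent speedup factor (before / after). Hypothetical helper; the PR
// table reports only the relative reduction.
inline double Speedup(double before_ms, double after_ms) {
  assert(after_ms > 0.0);
  return before_ms / after_ms;
}
```

For case 0, `RelativeReduction(5.7967, 0.7395)` gives about 0.8724, matching the reported 87.24%, which corresponds to a speedup of roughly 7.8x.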

TODO

  • adaptive pooling forward optimization
  • adaptive pooling backward optimization

paddle-bot bot commented Sep 12, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@luotao1 luotao1 added the contributor External developers label Sep 13, 2022
@@ -92,12 +92,12 @@ class AvgPoolGrad {
  */
 HOSTDEVICE inline int AdaptStartIndex(int ph, int input_size, int output_size) {
   return static_cast<int>(
-      floor(static_cast<double>(ph * input_size) / output_size));
+      floor(static_cast<float>(ph * input_size) / output_size));

When I optimized this op before, it did not occur to me that this spot could yield a performance gain. Good one!
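
The diff above replaces a `double` cast with a `float` cast in `AdaptStartIndex`. A likely reason this helps (an assumption, not stated in the PR) is that consumer GPUs such as the RTX 3080 have far lower double-precision than single-precision throughput. The host-side sketch below checks that the two variants return the same index for small sizes. `AdaptStartIndexDouble`, `AdaptStartIndexFloat`, and `VariantsAgree` are hypothetical names introduced for illustration; only the function bodies mirror the two lines of the diff.

```cpp
#include <cassert>
#include <cmath>

// Double-precision variant, mirroring the removed line of the diff.
inline int AdaptStartIndexDouble(int ph, int input_size, int output_size) {
  return static_cast<int>(
      std::floor(static_cast<double>(ph * input_size) / output_size));
}

// Single-precision variant, mirroring the added line of the diff.
inline int AdaptStartIndexFloat(int ph, int input_size, int output_size) {
  return static_cast<int>(
      std::floor(static_cast<float>(ph * input_size) / output_size));
}

// Hypothetical check: returns true if both variants (and plain integer
// division) agree for every output index ph, for all input and output
// sizes in [1, max_size].
inline bool VariantsAgree(int max_size) {
  for (int in = 1; in <= max_size; ++in) {
    for (int out = 1; out <= max_size; ++out) {
      for (int ph = 0; ph < out; ++ph) {
        const int d = AdaptStartIndexDouble(ph, in, out);
        const int f = AdaptStartIndexFloat(ph, in, out);
        if (d != f || d != (ph * in) / out) return false;
      }
    }
  }
  return true;
}
```

The check only covers small sizes, such as the spatial dimensions in the benchmark table (up to 224). A `float` represents integers exactly only up to 2^24, so agreement for much larger `ph * input_size` products is not established by this sketch.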

const int padding_width,
T1* output_data,
T2* mask_data,
FastDivModForPooling divmods) {

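The divmods parameter is not used in the computation, so it can be removed. Alternatively, if you want to use it, it can replace the div and mod calculations at Lines 1986-1987.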
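
As background for this suggestion, here is a minimal host-side sketch of the general fast div/mod technique: the divisor is fixed ahead of time, so each division becomes a multiply-high plus a shift. The names `FastDivModSketch`, `HW`, and `SplitIndex` are hypothetical; this is not Paddle's `FastDivModForPooling`, whose actual layout and interface are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of division by a fixed divisor via multiply and shift.
// Valid for divisors in [1, 2^31] and any 32-bit unsigned dividend.
struct FastDivModSketch {
  uint32_t divisor;
  uint32_t multiplier;
  uint32_t shift;

  explicit FastDivModSketch(uint32_t d) : divisor(d), multiplier(0), shift(0) {
    assert(d >= 1 && d <= (1u << 31));
    // shift = ceil(log2(d))
    while ((1u << shift) < d) ++shift;
    const uint64_t one = 1;
    // multiplier = floor(2^32 * (2^shift - d) / d) + 1, which fits in 32 bits.
    multiplier = static_cast<uint32_t>(
        ((one << 32) * ((one << shift) - d)) / d + 1);
  }

  // Quotient n / divisor, computed without a division instruction.
  uint32_t Div(uint32_t n) const {
    // High 32 bits of n * multiplier (what __umulhi would give on a GPU).
    const uint64_t t = (static_cast<uint64_t>(n) * multiplier) >> 32;
    // The sum is kept in 64 bits so it cannot overflow.
    return static_cast<uint32_t>((t + n) >> shift);
  }

  // Remainder n % divisor, derived from the quotient.
  uint32_t Mod(uint32_t n) const { return n - Div(n) * divisor; }
};

// Hypothetical illustration: split a flat offset within an H x W plane
// into its row (h) and column (w) using a precomputed divider for W.
struct HW {
  uint32_t h;
  uint32_t w;
};

inline HW SplitIndex(uint32_t index, const FastDivModSketch& by_width) {
  const uint32_t h = by_width.Div(index);
  return HW{h, index - h * by_width.divisor};
}
```

In a kernel, a divider like this would be built once on the host and passed by value, so each thread avoids the comparatively expensive integer `/` and `%` when decomposing its index.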

const int stride_width,
const int padding_height,
const int padding_width,
FastDivModForPooling divmods,

The same suggestion as below applies to the use of the divmods parameter.

@JamesLim-sy

Let's leave these suggestions to be addressed in the backward PR.


@JamesLim-sy JamesLim-sy left a comment


LGTM

@JamesLim-sy JamesLim-sy merged commit 6d06786 into PaddlePaddle:develop Sep 20, 2022
@OuyangChao

"Let's leave these suggestions to be addressed in the backward PR."

OK.


4 participants