
【Hackathon No.50】Add float16 data type support for the Paddle lerp operator #50925

Merged: 8 commits merged into PaddlePaddle:develop on Apr 3, 2023

Conversation

denglianbin (Contributor)

PR types: Others
PR changes: Others

Describe

Performance data (op benchmark)

shape_x           shape_y            w    fp32         fp16
[16, 102400]      [16, 102400]       0.2  0.232567349  0.213813782
[16, 1, 1, 1]     [16, 3, 224, 224]  0.5  0.332483953  0.30741424

@paddle-bot
paddle-bot bot commented Feb 26, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@CLAassistant
CLAassistant commented Feb 26, 2023

CLA assistant check: all committers have signed the CLA.


Review thread on:

```python
y = np.full(600, 1.0).astype(np.float16).reshape([50, 2, 2, 3])
w = np.random.random([3]).astype(np.float16)
self.inputs = {'X': x, 'Y': y, 'Weight': w}
self.outputs = {'Out': x + w * (y - x)}
```
Contributor:

  • Of these 3 unit tests, keeping just 1 is enough, since the kernel implementation does not handle different shapes differently.
  • The code can also be simplified: since the class inherits from TestLerp, the fp16 case only needs to implement init_dtype and init_shape (see the sketch below).
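
A minimal sketch of the suggested pattern, assuming the existing TestLerp base class invokes the init_* hooks from its setUp (the subclass name and shape value here are illustrative):

```python
import numpy as np

# Hypothetical fp16 case: only the dtype and shape hooks are overridden;
# inputs, outputs, and gradient checks are all inherited from TestLerp.
class TestLerpFP16(TestLerp):
    def init_dtype(self):
        self.dtype = np.float16

    def init_shape(self):
        self.shape = [100]
```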

Contributor (Author):

Hello,

  • I kept two of the three unit tests; the two that remain take different code paths in the backward kernel.
  • The test code has been simplified as requested.

Contributor:

The 5 cases above and TestLerpWithFp16BroadWToXY seem to differ only in shape: x and y share the same shape and w.shape=[1], so the test content is largely redundant. Keeping just one of the 5 cases alongside TestLerpWithFp16BroadWToXY should be enough.

Also, couldn't TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY be simplified by directly overriding init_shape and init_dtype as well?

Contributor (Author):

The five cases above and TestLerpWithFp16BroadWToXY are actually different: TestLerpWithFp16BroadWToXY uses w.shape=[3], which takes the broadcast-w branch in the backward kernel, while the five cases above use w.shape=[1], which takes the scalar-w branch.
Also, rewriting TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY through init_shape and init_dtype alone is awkward: in TestLerpWihFp16BroadXY the shapes of X and Y differ, and in TestLerpWithFp16BroadWToXY w is a random vector of shape [3] rather than the base class's scalar of shape [1]. So I chose to override setUp directly; is that acceptable?
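
For illustration, a small numpy sketch of the two weight layouts under discussion, with the x/y shape taken from the hunk above (this mirrors only the forward formula, not Paddle's backward kernel):

```python
import numpy as np

x = np.zeros([50, 2, 2, 3], dtype=np.float16)
y = np.full([50, 2, 2, 3], 1.0, dtype=np.float16)

w_scalar = np.random.random([1]).astype(np.float16)  # scalar branch in backward
w_vector = np.random.random([3]).astype(np.float16)  # broadcast branch: grad_w is
                                                     # reduced over the broadcast axes

out_scalar = x + w_scalar * (y - x)  # w broadcasts as a scalar
out_vector = x + w_vector * (y - x)  # w broadcasts along the last axis (size 3)
```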

Contributor:

Understood; then TestLerpWihFp16BroadXY and TestLerpWithFp16BroadWToXY can stay.
You could still inherit from the existing TestLerpOp, though: give init_shape the additional ability to set w's shape, with a small change to TestLerpOp. That would be more general and would also simplify the code.

Contributor:

Perhaps the unit tests could keep only 3 cases: TestLerpWihFp16BroadXY, TestLerpWithFp16BroadWToXY, and one where x and y share the same shape with w.shape=[1]?

Contributor (Author):

OK, I have made the changes as described: the base class now has init_wshape and init_xyshape, where init_xyshape handles initialization when the shapes of x and y differ. I also simplified the original FP32 TestLerpBroadXY and TestLerpBroadWToXY with these hooks.
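
A minimal sketch of the structure described above (a numpy-only stand-in for illustration; the real base class builds on Paddle's OpTest, and the hook defaults and broadcast shapes shown are examples):

```python
import numpy as np

class TestLerp:  # stands in for the OpTest-based class in Paddle's test suite
    def setUp(self):
        self.init_dtype()
        self.init_shape()
        self.init_xyshape()  # by default x and y share self.shape
        self.init_wshape()   # by default w is a scalar of shape [1]
        x = np.arange(1.0, 1.0 + np.prod(self.xshape)) \
            .astype(self.dtype).reshape(self.xshape)
        y = np.full(np.prod(self.yshape), 10.0) \
            .astype(self.dtype).reshape(self.yshape)
        w = np.random.random(self.wshape).astype(self.dtype)
        self.inputs = {'X': x, 'Y': y, 'Weight': w}
        self.outputs = {'Out': x + w * (y - x)}

    def init_dtype(self):
        self.dtype = np.float64

    def init_shape(self):
        self.shape = [100]

    def init_xyshape(self):
        self.xshape = self.shape
        self.yshape = self.shape

    def init_wshape(self):
        self.wshape = [1]

# The broadcast cases now override only the relevant hooks
# (shapes are illustrative broadcast-compatible examples):
class TestLerpBroadXY(TestLerp):
    def init_xyshape(self):
        self.xshape = [2, 1, 2, 5, 1, 5]
        self.yshape = [2, 2, 1, 5, 5, 1]

class TestLerpBroadWToXY(TestLerp):
    def init_xyshape(self):
        self.xshape = [50, 2, 2, 3]
        self.yshape = [50, 2, 2, 3]

    def init_wshape(self):
        self.wshape = [3]  # exercises the broadcast-w backward branch
```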

@zhangting2020 (Contributor):

Remember to submit the Chinese documentation change and reference it in the PR description.

@luotao1 (Contributor) commented Mar 14, 2023:

@denglianbin please prioritize addressing the issues in this PR.

@denglianbin (Contributor, Author):

@zhangting2020 Hi, here is the PR that updates the Chinese documentation: https://github.com/PaddlePaddle/docs/pull/5741

Review thread on:

```python
        self.shape = [2, 1, 2, 5, 1, 5]

    def init_dtype(self):
        self.dtype = np.float16
```
Contributor:

The 5 cases above differ only in rank and should invoke the same kernel: x and y always share the same shape, and every case even has the same element count (100). From a test-coverage standpoint, keeping only the largest of these identical tests is enough. The original fp32 unit tests are overly verbose; there is no need to mirror the same number of cases.

Contributor (Author):

(screenshot of the forward kernel)
These five cases take five different branches in the forward kernel. The FP32 tests were presumably written this way to verify correctness for each rank; does this still need to be changed?

Contributor:

These all actually go through the same LerpFunction. The reason there are so many branches is the Eigen implementation: the rank has to be supplied as a template parameter. But they all end up in the same function, and there is no difference in how the result is computed.
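
To make that concrete, a Python mimic of the dispatch (the real kernel is C++/Eigen, where the per-rank branches instantiate the same functor with the rank as a template parameter; this sketch only mirrors the control flow):

```python
import numpy as np

def lerp(x, y, w):
    # One branch per rank, mirroring the C++ switch over the tensor rank --
    # but every branch evaluates exactly the same expression, so testing a
    # single rank already covers the computation.
    if x.ndim == 1:
        return x + w * (y - x)
    elif x.ndim == 2:
        return x + w * (y - x)
    else:  # ranks 3..6 in the real kernel
        return x + w * (y - x)
```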

Contributor (Author):

done

Review thread on:

```python
        self.res_ref = self.x + self.w * (self.y - self.x)
        self.place = []
        if core.is_compiled_with_cuda():
            self.place.append(paddle.CUDAPlace(0))
```
Contributor:

What is TestLerpAPIFp16 meant to test? It inherits from unittest.TestCase, so it never actually runs the op.

Contributor (Author):

That one was left in by mistake; it has been deleted.

@zhangting2020 (Contributor) left a comment:

LGTM

@luotao1 merged commit a2cbc81 into PaddlePaddle:develop on Apr 3, 2023.
Labels: contributor (External developers)
6 participants