【Hackathon 5th No.3】为 Paddle 新增 masked_fill API #57355

AndSonder · 2023-09-15T02:18:01Z

PR types

Others

PR changes

Others

Description

为 Paddle 新增 masked_fill API

RFC 文档：https://github.com/PaddlePaddle/community/pull/616/files

paddle-bot · 2023-09-15T02:18:13Z

你的PR提交成功，感谢你对开源项目的贡献!
请关注后续CI自动化测试结果，详情请参考Paddle-CI手册。
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

AndSonder · 2023-09-21T02:10:59Z

@zoooo0820 麻烦研发老师review一下~ ，除了 doc ci 的问题应该都可以了

zoooo0820 · 2023-09-21T03:08:37Z

python/paddle/tensor/manipulation.py

+    #     broadcast_mask = paddle.cast(broadcast_mask, 'bool')
+
+    # if in_dynamic_mode():
+    #     return _C_ops.where_(broadcast_mask, broadcast_x, broadcast_value)


请移除非说明类的注释内容

zoooo0820 · 2023-09-21T03:09:25Z

python/paddle/tensor/manipulation.py

+        value (Scaler or 0-D Tensor): The value used to fill the target tensor.
+        name(str, optional): The default value is None. Normally there is no
+            need for user to set this property. For more information, please
+            refer to :ref:`api_guide_Name`.


参数说明中写明支持的dtype，且要和设计文档中的一致； Scaler -> Scalar

zoooo0820 · 2023-09-21T03:13:18Z

python/paddle/tensor/manipulation.py

+        value = paddle.full([1], value, x.dtype)
+
+    mask = paddle.bitwise_not(mask)
+    out = paddle.where(mask, x, value)


这里是否可以把value和x对调，省去一个not op操作

这里应该是不能对调的，paddle.where(cond, x, y) 在 inplace 的时候对boardcast的处理如下：

zeros_like_x = paddle.zeros_like(x) zeros_like_y = paddle.zeros_like(y) zeros_like_condition = paddle.zeros_like(condition) zeros_like_condition = paddle.cast(zeros_like_condition, x.dtype) cast_cond = paddle.cast(condition, x.dtype) broadcast_zeros = paddle.add(zeros_like_x, zeros_like_y) broadcast_zeros = paddle.add(broadcast_zeros, zeros_like_condition) broadcast_x = x.add_(broadcast_zeros) broadcast_y = paddle.add(y, broadcast_zeros) broadcast_condition = paddle.add(cast_cond, broadcast_zeros) broadcast_condition = paddle.cast(broadcast_condition, 'bool')

其中 broadcast_x = x.add_(broadcast_zeros) 用了 inplace 的操作，如果调换位置就会导致 masked_fill 和 mask_fill_ 运行结果不一致问题。broadcast 也会出问题。我一开始写的时候写的时候也是用的 paddle.where(mask, value, x) 但是跑单侧的时候就很多报错，inplace 和非 inplace 的梯度信息也不一样

zoooo0820 · 2023-09-21T03:14:51Z

python/paddle/tensor/manipulation.py

+                    [2., 2., 1.]])
+    """
+    if np.isscalar(value):
+        value = paddle.full([1], value, x.dtype)


这里虽然不影响结果，从Scalar的语义来看，是否应该full([], ...) 构建一个0-D Tensor

zoooo0820 · 2023-09-21T03:43:50Z

python/paddle/tensor/manipulation.py

+        value = paddle.full([1], value, x.dtype)
+
+    mask = paddle.bitwise_not(mask)
+    out = paddle.where_(mask, x, value)


同上，是否可以通过调换x , value 来避免额外的not操作

zoooo0820 · 2023-09-21T03:48:05Z

test/legacy_test/test_masked_fill_op.py

+                fetch_list=[out],
+            )
+            np.testing.assert_allclose(
+                res[0], self.out_np, atol=1e-5, rtol=1e-5


这个地方，如果是因为fp16/bf16的特殊case的话，建议给fp16/bf16单独设置下阈值，其他case使用默认阈值

这个阈值一般应该设置多少呢

这个阈值一般应该设置多少呢

默认即可，不要特意减小.

fp16单测过不去的话可以单独拿出来设置下

zoooo0820 · 2023-09-21T03:48:24Z

test/legacy_test/test_masked_fill_op.py

+        mask = paddle.to_tensor(self.mask_np).astype('bool')
+        value = paddle.to_tensor(self.value_np, dtype=self.dtype)
+        result = paddle.masked_fill(x, mask, value)
+        np.testing.assert_allclose(self.out_np, result.numpy(), rtol=1e-05)


zoooo0820 · 2023-09-21T03:50:54Z

test/legacy_test/test_masked_fill_op.py

@@ -0,0 +1,187 @@
+# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.


这个文件建议不用命名成test_xxx_op，这个pr中并没有新增这个op，直接test_xxx就行

zoooo0820 · 2023-09-21T03:53:14Z

test/legacy_test/test_masked_fill_op.py

+        self.x_shape = (300, 1)
+        self.mask_shape = (300, 40)
+        self.dtype = "float16"
+


测试用例需要和设计文档中的对齐一下：

可以补充一个bf16的case

0-dvalue 的case，需要一起验证下反向

验证反向是指加在 test_inplace 里面的单侧吗

验证反向是指加在 test_inplace 里面的单侧吗

直接在这个文件中写一个带backward的case，验证value.grad是否为预期即可

验证反向是指加在 test_inplace 里面的单侧吗

直接在这个文件中写一个带backward的case，验证value.grad是否为预期即可

请问有类似的例子吗，我看其他测试backward的都是新添加op的情况下去测，构建了一个静态图去跑。没有看到python端组合的api去测backward的代码

验证反向是指加在 test_inplace 里面的单侧吗

直接在这个文件中写一个带backward的case，验证value.grad是否为预期即可

请问有类似的例子吗，我看其他测试backward的都是新添加op的情况下去测，构建了一个静态图去跑。没有看到python端组合的api去测backward的代码

添加单测原因是，需要验证下目前组合的API的方式是否满足设计预期的效果，可参考：

Paddle/test/legacy_test/test_set_value_op.py

Line 1478 in 521f8e6

loss.backward()

Paddle/test/legacy_test/test_tensor_fill_diagonal_tensor.py

Line 59 in 521f8e6

(y.grad.numpy().astype('float32') == expected_grad).all(),

验证反向是指加在 test_inplace 里面的单侧吗

直接在这个文件中写一个带backward的case，验证value.grad是否为预期即可

请问有类似的例子吗，我看其他测试backward的都是新添加op的情况下去测，构建了一个静态图去跑。没有看到python端组合的api去测backward的代码

添加单测原因是，需要验证下目前组合的API的方式是否满足设计预期的效果，可参考：

Paddle/test/legacy_test/test_set_value_op.py

Line 1478 in 521f8e6

loss.backward()

Paddle/test/legacy_test/test_tensor_fill_diagonal_tensor.py

Line 59 in 521f8e6

(y.grad.numpy().astype('float32') == expected_grad).all(),

好的，我参考看一下，谢谢研发老师

…o masked_fill

AndSonder · 2023-09-26T06:44:06Z

还麻烦研发老师帮忙看下doc格式和 windows 下 bf16单侧的问题

AndSonder · 2023-10-11T08:37:33Z

还麻烦研发老师帮忙看下doc格式和 windows 下 bf16单侧的问题
@AndSonder

code-style问题请参考报错信息更新下precommit，或者根据给出的diff信息修改下代码

bf16问题，不太好直接确认原因，初步看了下可能跟初始化设置有关，建议参考下其他单测，使用 convert_float_to_uint16 / convert_float_to_uint16 完成转换

辛苦再看下剩余的CI问题：

coverage 覆盖率不足，需要补充value非tensor的case

codestyle挂了

static-check显示有print的使用，辛苦检查下除文档以外的部分是否有

此外 third_party 的变动不要加到这个PR中

已添加value非tensor的case
codestyle 已修复
已检查，只有文档部分有print
third_party 的改动已去除

zoooo0820 · 2023-10-12T03:40:18Z

@AndSonder
新增的单测中有报错，辛苦再看下；此外覆盖率未过，也辛苦再调整单测case覆盖下

AndSonder · 2023-10-12T13:18:01Z

@AndSonder 新增的单测中有报错，辛苦再看下；此外覆盖率未过，也辛苦再调整单测case覆盖下

已经都修复了，辛苦研发老师再看下

AndSonder · 2023-10-16T06:38:43Z

还麻烦研发老师有空的时候帮忙看下还有没有什么其他问题 @zoooo0820

zoooo0820

LGTM

jeff41404 · 2023-10-18T04:01:36Z

python/paddle/tensor/manipulation.py

+    if np.isscalar(value):
+        value = paddle.full([], value, x.dtype)
+
+    mask = paddle.bitwise_not(mask)


In this scenario, paddle.logical_not is more safe and semantical than paddle.bitwise_not, or if paddle.logical_not is used, it is necessary to check whether the data type of mask is bool, as other types may produce unexpected results.

jeff41404 · 2023-10-18T04:02:49Z

python/paddle/tensor/manipulation.py

+    if np.isscalar(value):
+        value = paddle.full([], value, x.dtype)
+
+    mask = paddle.bitwise_not(mask)


Same as the above issue

jeff41404

LGTM

sunzhongkai588 · 2023-10-20T03:54:22Z

python/paddle/tensor/manipulation.py

+        x (Tensor) : The Destination Tensor. Supported data types are float,
+        double, int, int64_t,float16 and bfloat16.
+        mask (Tensor): The boolean tensor indicate the position to be filled.
+            The data type of mask must be bool.
+        value (Scalar or 0-D Tensor): The value used to fill the target tensor.
+        Supported data types are float, double, int, int64_t,float16 and bfloat16.
+        name(str, optional): The default value is None. Normally there is no
+            need for user to set this property. For more information, please
+            refer to :ref:`api_guide_Name`.


Suggested change

x (Tensor) : The Destination Tensor. Supported data types are float,

double, int, int64_t,float16 and bfloat16.

mask (Tensor): The boolean tensor indicate the position to be filled.

The data type of mask must be bool.

value (Scalar or 0-D Tensor): The value used to fill the target tensor.

Supported data types are float, double, int, int64_t,float16 and bfloat16.

name(str, optional): The default value is None. Normally there is no

need for user to set this property. For more information, please

refer to :ref:`api_guide_Name`.

x (Tensor) : The Destination Tensor. Supported data types are float,

double, int, int64_t,float16 and bfloat16.

mask (Tensor): The boolean tensor indicate the position to be filled.

The data type of mask must be bool.

value (Scalar or 0-D Tensor): The value used to fill the target tensor.

Supported data types are float, double, int, int64_t,float16 and bfloat16.

name(str, optional): The default value is None. Normally there is no

need for user to set this property. For more information, please

refer to :ref:`api_guide_Name`.

sunzhongkai588 · 2023-10-20T03:56:00Z

python/paddle/tensor/manipulation.py

+        .. code-block:: python
+            >>> # doctest: +REQUIRES(env:GPU)


code-block 下加空行

Suggested change

.. code-block:: python

>>> # doctest: +REQUIRES(env:GPU)

.. code-block:: python

>>> # doctest: +REQUIRES(env:GPU)

sunzhongkai588 · 2023-10-20T03:56:37Z

python/paddle/tensor/manipulation.py

+        .. code-block:: python
+            >>> # doctest: +REQUIRES(env:GPU)


Suggested change

.. code-block:: python

>>> # doctest: +REQUIRES(env:GPU)

.. code-block:: python

>>> # doctest: +REQUIRES(env:GPU)

sunzhongkai588

LGTM，请同步提供中文文档

AndSonder · 2023-10-26T09:51:33Z

LGTM，请同步提供中文文档

已提供中文文档

【Hackathon 5th No.3】为 Paddle 新增 masked_fill API docs#6254

@sunzhongkai588

paddle-ci-bot · 2023-11-01T03:14:15Z

Sorry to inform you that bccd1f6's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

AndSonder · 2023-11-02T06:43:42Z

@luotao1 本 PR 的中文文档也没问题了，还麻烦帮忙看看可以合入了吗

* add masked_fill for paddle * update doc * update some test case * remove full_like * update test codes * update test cases * recover codes * update test codes * fix gradients error * update test codes * fix * add bf16 test cases * update code-block * update code-block * update test codes * Update __init__.py * fix * fix code style and recover third_party * add v grad check * add scalar value case * fix test case * use logical_not * fix doc style * Update manipulation.py

add masked_fill for paddle

4925ae4

paddle-bot bot added the contributor External developers label Sep 15, 2023

update doc

fbec78a

luotao1 added the PaddlePaddle Hackathon label Sep 15, 2023

Ligoml changed the title ~~为 Paddle 新增 masked_fill API~~ 【Hackathon 5th No.3】为 Paddle 新增 masked_fill API Sep 15, 2023

luotao1 assigned luotao1 and zoooo0820 Sep 15, 2023

update some test case

3cfde41

AndSonder changed the title ~~【Hackathon 5th No.3】为 Paddle 新增 masked_fill API~~ 为 Paddle 新增 masked_fill API Sep 15, 2023

AndSonder added 4 commits September 15, 2023 09:23

remove full_like

37dd891

update test codes

793fcdc

update test cases

0c1e8ce

recover codes

faa459a

AndSonder changed the title ~~为 Paddle 新增 masked_fill API~~ 【Hackathon 5th No.3】为 Paddle 新增 masked_fill API Sep 19, 2023

update test codes

0a4d8d4

Ligoml mentioned this pull request Sep 19, 2023

【PaddlePaddle Hackathon 5th】开源贡献个人挑战赛 #57262

Open

AndSonder added 2 commits September 20, 2023 08:23

fix gradients error

746a818

update test codes

d48d73d

zoooo0820 reviewed Sep 21, 2023

View reviewed changes

AndSonder and others added 7 commits September 21, 2023 07:53

fix

7a0ef26

add bf16 test cases

de8cd75

update code-block

fb1a2ea

Merge branch 'masked_fill' of https://github.com/AndSonder/Paddle int…

810575f

…o masked_fill

Merge branch 'develop' into masked_fill

1b213bf

update code-block

df6013e

Merge branch 'masked_fill' of https://github.com/AndSonder/Paddle int…

82bb307

…o masked_fill

AndSonder requested a review from zoooo0820 September 26, 2023 06:44

AndSonder added 2 commits October 11, 2023 07:58

add v grad check

f5db8f6

add scalar value case

0f48c13

fix test case

c8040ce

AndSonder requested a review from zoooo0820 October 12, 2023 13:17

zoooo0820 approved these changes Oct 16, 2023

View reviewed changes

jeff41404 reviewed Oct 18, 2023

View reviewed changes

AndSonder added 2 commits October 17, 2023 23:40

use logical_not

ccb535c

Merge branch 'develop' into masked_fill

2317773

jeff41404 previously approved these changes Oct 19, 2023

View reviewed changes

luotao1 assigned sunzhongkai588 Oct 19, 2023

sunzhongkai588 reviewed Oct 20, 2023

View reviewed changes

fix doc style

f678e39

AndSonder dismissed jeff41404’s stale review via f678e39 October 20, 2023 05:26

AndSonder requested a review from sunzhongkai588 October 23, 2023 07:38

AndSonder added 2 commits October 23, 2023 02:40

Update manipulation.py

d6b7bd7

Merge branch 'develop' into masked_fill

bccd1f6

sunzhongkai588 approved these changes Oct 24, 2023

View reviewed changes

AndSonder mentioned this pull request Oct 26, 2023

【Hackathon 5th No.3】为 Paddle 新增 masked_fill API PaddlePaddle/docs#6254

Merged

luotao1 merged commit d192740 into PaddlePaddle:develop Nov 2, 2023
28 checks passed

zyfncg mentioned this pull request May 16, 2024

masked_fill_對int64處理異常，塞入paddle.iinfo(paddle.int64).max會被當作min #64365

Closed

		@@ -0,0 +1,187 @@
		# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.

【Hackathon 5th No.3】为 Paddle 新增 masked_fill API #57355

【Hackathon 5th No.3】为 Paddle 新增 masked_fill API #57355

Conversation

AndSonder commented Sep 15, 2023 • edited Loading

PR types

PR changes

Description

paddle-bot bot commented Sep 15, 2023

AndSonder commented Sep 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AndSonder commented Sep 26, 2023

AndSonder commented Oct 11, 2023 • edited Loading

zoooo0820 commented Oct 12, 2023

AndSonder commented Oct 12, 2023 • edited Loading

AndSonder commented Oct 16, 2023

zoooo0820 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jeff41404 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sunzhongkai588 left a comment

Choose a reason for hiding this comment

AndSonder commented Oct 26, 2023 • edited Loading

paddle-ci-bot bot commented Nov 1, 2023

AndSonder commented Nov 2, 2023

AndSonder commented Sep 15, 2023 •

edited

Loading

AndSonder commented Sep 21, 2023 •

edited

Loading

AndSonder commented Oct 11, 2023 •

edited

Loading

AndSonder commented Oct 12, 2023 •

edited

Loading

AndSonder commented Oct 26, 2023 •

edited

Loading