
masked_fill_ mishandles int64: filling with paddle.iinfo(paddle.int64).max gets treated as min #64365

Closed
anderson101866 opened this issue May 16, 2024 · 13 comments

@anderson101866

anderson101866 commented May 16, 2024

Describe the Bug

In the simple script below, passing in the int64 maximum value turns it into the minimum value:

import paddle; print(paddle.__version__) #2.6.0

t = paddle.zeros((2,2), dtype=paddle.int64)
print(paddle.iinfo(paddle.int64).max == 2**63-1) #True
t.masked_fill_(paddle.to_tensor([[0,0],[0,1]]), 2**63-1) #<--------------
print(t) 
#Tensor(shape=[2, 2], dtype=int64, place=Place(gpu:1), stop_gradient=True,
#       [[ 0                  ,  0                  ],
#        [ 0                  , -9223372036854775808]])

Additional Supplementary Information

No response

@zyfncg
Contributor

zyfncg commented May 16, 2024

@AndSonder Could you kindly take a look at this issue?

Op source: #57355

@AndSonder
Contributor

AndSonder commented May 16, 2024

@zyfncg This looks like a bug in the full op; the following code produces an overflow:

>>> paddle.full([], 2**63-1, paddle.int64)
Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
       -9223372036854775808)

@AndSonder
Contributor

AndSonder commented May 16, 2024

@anderson101866 You can achieve the desired behavior with the following code:

import paddle; print(paddle.__version__) #2.6.0

t = paddle.zeros((2,2), dtype=paddle.int64)
print(paddle.iinfo(paddle.int64).max == 2**63-1) #True
val = paddle.to_tensor(2**63-1, paddle.int64)
t.masked_fill_(paddle.to_tensor([[0,0],[0,1]]), val) #<--------------
print(t) 
#Tensor(shape=[2, 2], dtype=int64, place=Place(gpu:0), stop_gradient=True,
#       [[0                  , 0                  ],
#        [0                  , 9223372036854775807]])

@zyfncg
Contributor

zyfncg commented May 16, 2024

> @zyfncg This looks like a bug in the full op; the following code produces an overflow:
>
> >>> paddle.full([], 2**63-1, paddle.int64)
> Tensor(shape=[], dtype=int64, place=Place(gpu:0), stop_gradient=True,
>        -9223372036854775808)

👍

In that case, could paddle.full be replaced with to_tensor here?

@AndSonder
Contributor

> In that case, could paddle.full be replaced with to_tensor here?

Yes, it can be replaced, but wouldn't it still be better to fix this bug in full itself? Otherwise other APIs may have similar, as-yet-undiscovered out-of-range issues.

@anderson101866
Author

anderson101866 commented May 16, 2024

I've run into a similar problem that also looks like an overflow issue:

import paddle; print(paddle.__version__) #2.6.0
x = paddle.to_tensor([[0, 0], 
                      [-2**63, 0]], dtype=paddle.int64)
print(x) 
x == 2**63-1 # __eq__
#Tensor(shape=[2, 2], dtype=bool, place=Place(gpu:0), stop_gradient=True,
#       [[False, False],
#        [True , False]])

Again, the max value gets treated as the min.

@AndSonder
Contributor

> I've run into a similar problem that also looks like an overflow issue […] Again, the max value gets treated as the min.

This should also be caused by numerical overflow, probably for a similar reason.
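
For illustration, here is a minimal pure-Python sketch of that suspected chain; the assumption that the scalar on the right-hand side of == also gets routed through a C++ double before the elementwise kernel runs is mine, not something confirmed in this thread:

rhs = 2**63 - 1                                # paddle.iinfo(paddle.int64).max
as_double = int(float(rhs))                    # 9223372036854775808 == 2**63 after rounding
wrapped = (as_double + 2**63) % 2**64 - 2**63  # two's-complement wrap into a signed 64-bit value
print(wrapped)                                 # -9223372036854775808, which matches x[1][0]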

@zyfncg
Contributor

zyfncg commented May 17, 2024

[screenshot]

Did some further analysis: when pybind converts the Python value to C++, 2**63-1 gets treated as a float. Since 2**63-1 and 2**63 have the same floating-point representation, parsing it back to int64 on the C++ side yields 2**63, which causes a precision overflow.

This doesn't look easy to fix; handling boundary values like this is always risky. If there is a workaround, we'd suggest avoiding this code path for now.
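
The precision loss itself can be reproduced in plain Python, independent of Paddle; a minimal sketch of the behaviour described above:

big = 2**63 - 1                    # paddle.iinfo(paddle.int64).max
print(float(big) == float(2**63))  # True: a 64-bit double cannot represent 2**63 - 1 exactly
print(int(float(big)))             # 9223372036854775808 == 2**63, one past the int64 max
# Converting 2**63 into a C++ int64_t overflows, which is why the tensor ends up holding
# -9223372036854775808 (int64 min) instead of the requested maximum.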

@AndSonder
Contributor

> Did some further analysis: when pybind converts the Python value to C++, 2**63-1 gets treated as a float. Since 2**63-1 and 2**63 have the same floating-point representation, parsing it back to int64 on the C++ side yields 2**63, which causes a precision overflow.

Could this float conversion be coming from here?

[screenshot]

@zyfncg
Contributor

zyfncg commented May 17, 2024

Verified it; this is indeed where it comes from.

@zyfncg
Contributor

zyfncg commented May 17, 2024

There is another problem: in static-graph mode the value here can only be represented as a float, so even if the dynamic-graph value is converted to int64, the same precision overflow will still occur after dynamic-to-static conversion.

@Xreki
Contributor

Xreki commented May 30, 2024

This problem was fixed in earlier versions by passing the value through str_value. Below is the implementation from the 2.2 branch:

auto str_value = ctx.Attr<std::string>("str_value");
auto float_value = ctx.Attr<float>("value");
auto force_cpu = ctx.Attr<bool>("force_cpu");
auto place_type = ctx.Attr<int>("place_type");
framework::Tensor *tensor = nullptr;
framework::Variable *out_var = ctx.OutputVar("Out");
T value;
if (str_value.empty()) {
  value = static_cast<T>(float_value);
} else {
  // handle NaN/Inf first, which cannot be read from stream.
  if (str_value == "inf") {
    value = static_cast<T>(std::numeric_limits<double>::infinity());
  } else if (str_value == "-inf") {
    value = static_cast<T>(-std::numeric_limits<double>::infinity());
  } else if (str_value == "nan") {
    value = static_cast<T>(std::numeric_limits<double>::quiet_NaN());
  } else {
    std::stringstream convert_stream(str_value);
    if (std::is_same<int64_t, T>::value) {
      int64_t tmp_value;
      convert_stream >> tmp_value;
      value = static_cast<T>(tmp_value);
    } else {
      double tmp_value;
      convert_stream >> tmp_value;
      value = static_cast<T>(tmp_value);
    }
  }
}

And on the Python side, where the attributes are filled in:

if not isinstance(value, Variable):
    if dtype in ['uint8', 'int16', 'int32', 'int64']:
        attrs['str_value'] = str(int(value))
        attrs['value'] = int(value)
    else:
        attrs['str_value'] = str(float(value))
        attrs['value'] = float(value)
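
To make the effect of the string path concrete, here is a small sketch (not Paddle code) contrasting the float round trip with the decimal-string round trip used by the old str_value attribute:

val = 2**63 - 1
via_float = int(float(val))  # 9223372036854775808: no longer fits in an int64_t on the C++ side
via_str   = int(str(val))    # 9223372036854775807: exact, as with the str_value path above
print(via_float, via_str)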

@anderson101866
Author

anderson101866 commented Jul 4, 2024

After syncing with Baidu, Kai Wang said this work is not currently scheduled, and it is not blocking NV for now.
Closing this for now.

As a side note, other APIs operating on int64 may have the same kind of precision overflow problem.

For example, equality (__eq__) likewise causes overly large/small numbers to be treated as the same number:

import paddle; print(paddle.__version__) #2.6.0
x = paddle.to_tensor([[0, 0], 
                      [-2**63, 0]], dtype=paddle.int64)
print(x) 
x == 2**63-1 # __eq__
#Tensor(shape=[2, 2], dtype=bool, place=Place(gpu:0), stop_gradient=True,
#       [[False, False],
#        [True , False]])

anderson101866 closed this as not planned on Jul 4, 2024