
a bug about function bbox_iou() #2376

Closed
chongkuiqi opened this issue Mar 5, 2021 · 7 comments · Fixed by #2385
Labels: bug, Stale

Comments

@chongkuiqi

🐛 Bug

In the bbox_iou() function of general.py, if box1 equals box2 the function should return 1.0, but we get NaN. In addition, the default eps is too small for half precision.
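
For context, the CIoU tail of bbox_iou() looked roughly like this at the time (a paraphrased sketch, not the verbatim source):

```python
# Paraphrased sketch of the CIoU branch of bbox_iou() (utils/general.py).
# v penalizes aspect-ratio mismatch; alpha weights that penalty.
v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
with torch.no_grad():
    alpha = v / ((1 + eps) - iou + v)  # the line at issue
return iou - (rho2 / c2 + v * alpha)  # CIoU
```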

To Reproduce (REQUIRED)

Input:

(1)

```python
import torch
from utils.general import bbox_iou  # assumes the YOLOv5 repo root is on PYTHONPATH

a = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], dtype=torch.float32)
b = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], dtype=torch.float32)
giou = bbox_iou(a.T, b, x1y1x2y2=False, CIoU=True)  # giou(prediction, target)
print(giou)  # tensor([nan])
```

This outputs NaN. If we change one line in bbox_iou() from `alpha = v / ((1 + eps) - iou + v)` to `alpha = v / ((1 - iou + v) + eps)`, we get the correct output of 1.0.
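
The root cause is floating-point rounding: with the default eps=1e-9, the scalar 1 + eps gets cast to the tensor's FP32 dtype and rounds back to exactly 1.0 (FP32 machine epsilon is about 1.19e-7), so for identical boxes the denominator collapses to 1.0 - 1.0 + 0.0 = 0 and alpha = 0/0 = NaN. A quick check (my illustration, not from the original report):

```python
import torch

eps = 1e-9
iou = torch.tensor(1.0)  # identical boxes: IoU = 1
v = torch.tensor(0.0)    # identical boxes: aspect-ratio term = 0

print(torch.tensor(1.0) + eps == 1.0)  # tensor(True): eps vanishes below FP32 epsilon
print(v / ((1 + eps) - iou + v))       # tensor(nan): 0 / 0
print(v / ((1 - iou + v) + eps))       # tensor(0.): denominator stays at eps
```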

(2)

```python
a = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float16)
b = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float16)
giou = bbox_iou(a.T, b, x1y1x2y2=False, CIoU=True)  # giou(prediction, target)
print(giou)  # nan
```

This also outputs NaN; if we instead use 3e-8 as the default eps rather than 1e-9, we get the correct output of 1.0.
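
The same rounding argument explains the FP16 case, where machine epsilon is much larger and very small eps values underflow entirely (my illustration; the thresholds come from torch.finfo):

```python
import torch

print(torch.finfo(torch.float16).eps)           # ~0.000977 (FP16 machine epsilon)
print(torch.tensor(1e-9, dtype=torch.float16))  # tensor(0.): 1e-9 underflows to zero
print(torch.tensor(3e-8, dtype=torch.float16))  # ~6e-08: survives as the smallest subnormal
```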

chongkuiqi added the bug label Mar 5, 2021
@glenn-jocher
Member

@chongkuiqi thanks for the bug report! This is a very interesting find.

I used this order of operations specifically for speed: 1 and eps are both scalars while iou and v are large tensors, so grouping the scalar + scalar operation first gives a faster alpha computation. I didn't realize NaN outputs were possible with this; I'll take a look at your proposed solution.

The eps value may also need some adjustment, as you found. Good discovery, I will look at this as well! test.py generally runs at FP16 and training is mixed precision, so it's possible that real-world usage passes FP16 boxes in here.

@glenn-jocher
Member

@chongkuiqi could you submit a PR with your proposed fix please?

@glenn-jocher
Member

glenn-jocher commented Mar 6, 2021

I get these profile times for the candidate implementations:

```
%timeit alpha = v / (-iou + v + (1 + eps))
11.2 µs ± 59.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit alpha = v / (-iou + v + 1 + eps)
16.3 µs ± 41.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit alpha = v / ((1 + eps) - iou + v)
16.8 µs ± 213 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit alpha = v / ((1 - iou + v) + eps)
22.7 µs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

When I try to test FP16 I get a not-implemented error, so I don't think this function is used in FP16 mode. One solution I found that works is updating the default eps to 1e-7, which keeps the fast alpha computation.

EDIT: the fastest implementation on my computer seems to be the one below. Perhaps coupling this with eps=1e-7 would work best.

```
%timeit alpha = v / (v - iou + (1 + eps))
9.28 µs ± 87.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

@chongkuiqi
Author

> v / (v - iou + (1 + eps))

Thanks! That's exciting. I hadn't taken speed into consideration before. I used `alpha = v / (v - iou + (1 + eps))` and tested again.
(1) In FP32 mode, it works well with eps=1e-7 (with the old default eps it still outputs NaN):

```python
box1 = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float32)
box2 = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float32)
giou = bbox_iou(box1.T, box2, x1y1x2y2=False, CIoU=True, eps=1e-7)  # giou(prediction, target)
print(giou)  # tensor([1.], device='cuda:0')
```
(2) In FP16 mode, however, I got NaN again using `alpha = v / (v - iou + (1 + eps))` with eps=1e-7. I set box1 and box2 to FP16 explicitly as follows:

```python
box1 = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float16)
box2 = torch.tensor([[1.42969, 0.25635, 3.78125, 7.37109]], device='cuda:0', dtype=torch.float16)
giou = bbox_iou(box1.T, box2, x1y1x2y2=False, CIoU=True, eps=1e-7)  # giou(prediction, target)
print(giou)  # nan
```

@glenn-jocher
Member

glenn-jocher commented Mar 6, 2021

Yes, this is true; I did not check FP16, and in general we want eps to be as small as possible so it does not affect the IoU results. Let me see what the datatypes look like during training... OK, in my Colab tests I added a simple print(box1.dtype, box2.dtype) to check the datatypes, and during GPU training and testing they show as FP32.

So I'll apply two changes in a PR: 1) adjust eps to 1e-7, and 2) switch alpha to the faster version, which should speed the computation up by almost 2x.
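
Combined, the fix keeps the fast grouping but raises eps to 1e-7, which sits just above FP32 machine epsilon so the 1 + eps term survives rounding. A minimal before/after check (my illustration; see the linked PR for the authoritative diff):

```python
import torch

iou = torch.tensor(1.0)  # identical boxes: IoU = 1
v = torch.tensor(0.0)    # identical boxes: aspect-ratio term = 0

old = v / ((1 + 1e-9) - iou + v)  # old default eps: 1 + 1e-9 rounds to 1.0 -> 0/0
new = v / (v - iou + (1 + 1e-7))  # new: 1 + 1e-7 survives FP32 rounding
print(old, new)  # tensor(nan) tensor(0.)
```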

@glenn-jocher
Member

@chongkuiqi ok I've made a PR!

I think this is probably the best compromise. Another factor is that the function doesn't actually support FP16 on CPU; when I try it I get an error, so all of the real-world usage is FP32 on GPU/CPU:

```
RuntimeError: "clamp_min_cpu" not implemented for 'Half'
```
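
For reference, the failing op is presumably the CPU clamp used inside the IoU computation; on PyTorch builds from that era it could be triggered directly with any half-precision clamp on CPU (my illustration, version-dependent):

```python
import torch

# On early-2021 PyTorch builds this raised:
#   RuntimeError: "clamp_min_cpu" not implemented for 'Half'
# (newer versions support Half clamp on CPU)
torch.zeros(4, dtype=torch.float16).clamp(min=0)
```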

glenn-jocher removed the TODO label Mar 6, 2021
glenn-jocher linked a pull request Mar 6, 2021 that will close this issue
@github-actions
Contributor

github-actions bot commented Apr 6, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
