
Random Crop yields incorrect value for bounding box #903

Closed
bw4sz opened this issue May 21, 2021 · 14 comments

Comments


bw4sz commented May 21, 2021

🐛 Bug

The bbox_random_crop function does not produce a reasonable result. Consider the following snippet of code.

To Reproduce

from albumentations.augmentations import functional as F

bbox = (0.129, 0.7846, 0.1626, 0.818)
cropped_bbox = F.bbox_random_crop(bbox=bbox, crop_height=100, crop_width=100, h_start=0.7846, w_start=0.12933, rows=1500, cols=1500)
print(cropped_bbox)
# (0.125, 0.788999, 0.629, 1.29)
# Notice y2 is outside of the crop range.

# But the following assert passes
assert bbox[3] < (100 / 1500) + 0.7846

# This one fails
assert all(0 <= y <= 1 for y in cropped_bbox)

Expected behavior

An augmented bbox that lies fully within the image crop. The start of the crop plus the crop height is larger than the y2 of the bounding box, so the box should fit inside the crop, yet the 1.29 coordinate in the cropped box suggests it is outside the crop area.

Environment

  • Albumentations version (e.g., 0.1.8): 0.5.2
  • Python version (e.g., 3.7): 3.7
  • OS (e.g., Linux): OSX
  • How you installed albumentations (conda, pip, source): pip
  • Any other relevant information:

Additional context

I am writing a custom augmentation to zoom in on a given bounding box: a safe crop, but one that does not need to preserve all boxes. Is there syntax I'm misunderstanding? It doesn't feel like that could be the case. A dtype issue?


bw4sz commented May 22, 2021

I wanted to come back and give a simpler example to zero in on the syntax that I must not be understanding.

This passes

If you have a box at 100/1000 and make a 200 pixel crop starting at 0, the final bbox coordinate should be 0.5

bbox = (0, 0, 0.1, 0.1)
crop_height = 200
rows = 1000
h_start = 0
w_start = 0
cropped_bbox = F.bbox_random_crop(bbox=bbox, crop_height=crop_height, crop_width=crop_height, h_start=h_start, w_start=w_start, rows=rows, cols=rows)
print(cropped_bbox)

assert cropped_bbox[3] == bbox[3] / (crop_height / rows)
assert all(0 <= y <= 1 for y in cropped_bbox)
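The expected value in the first assert can be checked by hand: the box bottom sits at 0.1 * 1000 = 100 px in the original image, and a 200 px crop starting at row 0 renormalizes that to 100 / 200 = 0.5. A minimal sketch of that arithmetic in plain Python (no albumentations required):

```python
# Expected new coordinate for a bbox edge after cropping from row 0:
# the absolute pixel position divided by the crop size.
bbox_y2 = 0.1        # normalized bottom edge in the original image
rows = 1000          # original image height in pixels
crop_height = 200    # crop size in pixels

abs_y2 = bbox_y2 * rows              # 100 px in the original image
cropped_y2 = abs_y2 / crop_height    # renormalize to the crop
print(cropped_y2)                    # 0.5
```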

If you shift everything over 100 pixels, the logic holds.

bbox = (0.1, 0.1, 0.2, 0.2)
crop_height = 200
rows = 1000
h_start = 0
w_start = 0
cropped_bbox = F.bbox_random_crop(bbox=bbox, crop_height=crop_height, crop_width=crop_height, h_start=h_start, w_start=w_start, rows=rows, cols=rows)
print(cropped_bbox)

assert cropped_bbox[3] == bbox[3] / (crop_height / rows)
assert all(0 <= y <= 1 for y in cropped_bbox)

but if you begin the crop at any position other than 0, you get a strange result:

bbox = (0.1, 0.1, 0.2, 0.2)
crop_height = 200
rows = 1000
h_start = 0.1
w_start = 0.1
cropped_bbox = F.bbox_random_crop(bbox=bbox, crop_height=crop_height, crop_width=crop_height, h_start=h_start, w_start=w_start, rows=rows, cols=rows)
print(cropped_bbox)
# (0.1, 0.1, 0.6, 0.6)

The size of the box is correct, but the final bbox start coordinate should be (0, 0), since h_start == bbox[0] and w_start == bbox[1]?

Digging into the function, you can see that get_random_crop_coords produces an unintuitive result:

F.get_random_crop_coords(height=1000, width=1000, crop_height=200, crop_width=200, h_start=0.1, w_start=0.1)
(80, 80, 280, 280)

I would expect a 200 pixel crop starting at 0.1 with an original image size of 1000 to have crop coordinates of 100,300, not 80,280. Where is the 20 offset coming from?

Going one layer deeper to

def get_random_crop_coords(height: int, width: int, crop_height: int, crop_width: int, h_start: float, w_start: float):
    y1 = int((height - crop_height) * h_start)

Why is this not

    y1 = int(height * h_start)

The size of the crop does not affect the location of the min coordinate, only the location of the max coordinate. Any crop starting at 0.1 of a 1000-row image should have a y1 coordinate of 100, regardless of crop size?
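A plain-Python sketch of the two candidate formulas makes the 20 px offset concrete (y1_actual mirrors the line quoted from get_random_crop_coords above; y1_expected is the formula the comment suggests):

```python
height, crop_height, h_start = 1000, 200, 0.1

# Formula used by albumentations: h_start is treated as a fraction of the
# *valid start range* (height - crop_height), so every crop stays in bounds.
y1_actual = int((height - crop_height) * h_start)

# Formula the comment expected: h_start as a fraction of the whole image.
y1_expected = int(height * h_start)

print(y1_actual, y1_actual + crop_height)        # 80 280
print(y1_expected, y1_expected + crop_height)    # 100 300
```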


Dipet commented May 23, 2021

Because if we used y1 = int(height * h_start) we might get a crop smaller than the requested crop size, for example when h_start=0.9 and crop_height=200 with rows=1000.
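A quick plain-Python check of those numbers, assuming the same int truncation as get_random_crop_coords:

```python
height, crop_height, h_start = 1000, 200, 0.9

# Naive formula: the crop would extend past the bottom of the image,
# so it would have to be clipped to fewer than 200 rows.
y1_naive = int(height * h_start)                   # 900
out_of_bounds = y1_naive + crop_height > height    # 1100 > 1000 -> True

# Implemented formula: scaling by the valid start range guarantees
# a full-size crop for any h_start in [0, 1].
y1_impl = int((height - crop_height) * h_start)    # 720
fits = y1_impl + crop_height <= height             # 920 <= 1000 -> True
```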


bw4sz commented May 23, 2021

Sure, that's definitely true. But maybe just raise a ValueError there instead? It creates a disconnect between the function's goal (return the coordinates of a crop of a given size and starting position) and its output. For example, imagine a user needs a crop at a specific location; this function makes it difficult to get the correct position. I had to debug for a few hours to get to this. As a compromise, can we at least comment the code in bbox_random_crop to say explicitly that the output coordinates are relative to height - crop_height? I don't think anyone could have anticipated this, and I'm guessing there are other pieces of logic in the codebase with a similar structure. I forked and changed that line, but it yielded strange boxes.
[image: seabirds_rgb_279]

I can open a PR on the wider goal of creating a ZoomSafe augmentation. I think it will be of use to a wide audience. It's somewhere between RandomResizeCropSafe and Random_Crop. The safe aspect is important because torchvision detection models do not accept blank annotations.


Dipet commented May 23, 2021

But maybe just raise a value error there instead?

Bad idea, because we have transforms that use the current logic, and this change would break their functionality.

As a compromise, can we at least comment the code in bbox_random_crop to say explicitly that the output coordinates are relative to height - crop_height?

I agree, it would be great to add an explanation of how this function works.

I can open a PR on the wider goal of creating a ZoomSafe augmentation. I think it will be of use to a wide audience. It's somewhere between RandomResizeCropSafe and Random_Crop. The safe aspect is important because torchvision detection models do not accept blank annotations.

It would be great. We will wait for your PR.


Dimfred commented May 26, 2021

Btw, padding an image also produces wrong bounding boxes. All bboxes are shifted to the left and to the top on the padded image.
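For reference, the expected bookkeeping when padding is that every bbox shifts right/down by exactly the left/top pad amounts. A minimal sketch with a hypothetical helper (pad_bbox is illustrative, not an albumentations function):

```python
def pad_bbox(bbox, pad_left, pad_top):
    """Shift an absolute-pixel (x_min, y_min, x_max, y_max) box into a padded image."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min + pad_left, y_min + pad_top, x_max + pad_left, y_max + pad_top)

# Centered padding of a 590x590 image up to 600x600 adds 5 px on each side,
# so a box at (100, 100, 200, 200) should land at (105, 105, 205, 205).
print(pad_bbox((100, 100, 200, 200), pad_left=5, pad_top=5))
```

Any deviation from this pure translation (such as the few-pixel shift reported below) points at a coordinate-conversion bug rather than the padding itself.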


bw4sz commented May 26, 2021 via email


Dimfred commented May 26, 2021

Sure. The first image is from the padded case, and the second from an unpadded one. The green bounding box is the ground truth. Just for clarification, the red box is a prediction; ignore it. The difference between the two green boxes is what matters.

This is really bad for me: I have to redo 10 days of grid search for my thesis.

With padding
[image: padded]

Without padding
[image: unpadded]

There is a difference of about 3-4 pixels.


Dimfred commented May 26, 2021

So far as I can see in bbox_utils, my yolo coordinates first get denormalized to absolute coordinates and then renormalized to the albumentations format; in that process some int trimming happens, which seems to cause these issues. I could simply normalize without denormalizing, by using the yolo format as it is and converting in the float domain. It seems like an extra step anyway.
I also noticed that SafeRotate seems to be broken as well.
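A schematic illustration of the kind of round-trip error being described; the int truncation here is illustrative, not the exact bbox_utils code:

```python
cols = 999    # image width in pixels
x = 0.5       # normalized yolo x-center

# relative -> absolute with int truncation -> back to relative
x_abs = int(x * cols)    # int(499.5) == 499, the .5 px is discarded
x_back = x_abs / cols    # 0.49949..., no longer 0.5

# The coordinate has drifted by roughly half a pixel; applied per edge and
# per conversion, errors like this can accumulate to a visible shift.
print(x, x_back)
```

Staying in the float domain, as proposed, avoids the truncation entirely.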


bw4sz commented May 26, 2021

I can see similar behavior

[image: Rzepecki Islands_south_2016_Chinstrap_penguins_96]

transform = A.Compose([
    A.PadIfNeeded(min_height=600, min_width=600, border_mode=cv2.BORDER_CONSTANT, value=0),
    A.pytorch.ToTensorV2()
])

@Dipet I will work on a reproducible example tomorrow. I believe all of these issues are related to the normalization. I am using pascal_voc, not yolo. I will rerun with:

        transform = A.Compose([
            A.PadIfNeeded(min_height=600,min_width=600, border_mode=cv2.BORDER_CONSTANT, value=0),            
            A.pytorch.ToTensorV2()
        ], A.BboxParams(format='pascal_voc'))


Dimfred commented May 27, 2021

Are you also transforming your bounding boxes, or only the image? Your result looks like you are not transforming the bounding boxes, or you don't put the class label at the end. Your results are really broken; look at mine, where the bounding box is shifted by only a few pixels.


Dimfred commented May 27, 2021

So I fixed the errors on my side by changing the conversion in bbox_utils for the yolo case. Since yolo coordinates are already relative, a conversion to absolute format and back to relative will yield errors.

So instead of converting to absolute, I removed that step and now work directly with the coordinates I get from yolo:

x, y, w, h = yolo_bbox

w_half, h_half = w / 2, h / 2
x_min, x_max = x - w_half, x + w_half
y_min, y_max = y - h_half, y + h_half

and vice versa in the from_albumentation case.
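Putting the snippet above together with its inverse, the whole conversion can stay in the relative/float domain. A sketch with hypothetical helper names (not the actual bbox_utils API):

```python
def yolo_to_albumentations(yolo_bbox):
    """(x_center, y_center, w, h), all relative -> (x_min, y_min, x_max, y_max), relative."""
    x, y, w, h = yolo_bbox
    w_half, h_half = w / 2, h / 2
    return (x - w_half, y - h_half, x + w_half, y + h_half)

def albumentations_to_yolo(bbox):
    """Inverse conversion, also without touching absolute pixel coordinates."""
    x_min, y_min, x_max, y_max = bbox
    w, h = x_max - x_min, y_max - y_min
    return (x_min + w / 2, y_min + h / 2, w, h)

# Round-tripping recovers the original box up to ordinary float rounding,
# with no dependence on the image size and no int truncation anywhere.
box = (0.5, 0.5, 0.2, 0.1)
print(albumentations_to_yolo(yolo_to_albumentations(box)))
```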


Dipet commented Jul 7, 2021

Should be fixed by #924

Dipet closed this as completed Jul 7, 2021

veve90 commented Jun 2, 2022

So I fixed the errors on my side by changing the conversion in bbox_utils for the yolo case. Since yolo coordinates are already relative, a conversion to absolute format and back to relative will yield errors.

So instead of converting to absolute, I removed that step and now work directly with the coordinates I get from yolo:

x, y, w, h = yolo_bbox

w_half, h_half = w / 2, h / 2
x_min, x_max = x - w_half, x + w_half
y_min, y_max = y - h_half, y + h_half

and vice versa in the from_albumentation case.

Could you please tell us in what function you propose this change? Thank you!
