Question about YOLO format in convert_bbox_to_albumentations #883

Closed
christian-cahig opened this issue Apr 25, 2021 · 4 comments

christian-cahig commented Apr 25, 2021

Hi, I was looking through this part of convert_bbox_to_albumentations:

elif source_format == "yolo":
    # https://github.com/pjreddie/darknet/blob/f6d861736038da22c9eb0739dca84003c5a5e275/scripts/voc_label.py#L12
    bbox, tail = bbox[:4], tuple(bbox[4:])
    _bbox = np.array(bbox[:4])
    if np.any((_bbox <= 0) | (_bbox > 1)):
        raise ValueError("In YOLO format all labels must be float and in range (0, 1]")
    x, y, width, height = denormalize_bbox(bbox, rows, cols)
    x_min = int(x - width / 2 + 1)
    x_max = int(x_min + width)
    y_min = int(y - height / 2 + 1)
    y_max = int(y_min + height)

Suppose the call x, y, width, height = denormalize_bbox(bbox, rows, cols) yields

  • x = y = 4.0
  • width = height = 3.0

for rows = cols = 5. Because of the int(...) casts, this gives x_min = 3 and x_max = 6, i.e., x_max is greater than cols. The current implementation seems to shift the bbox 1 pixel to the right (and 1 pixel downwards).
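The arithmetic above can be checked with a small sketch. It copies only the four corner lines from the quoted snippet and assumes the values are already denormalized; the function name is made up for illustration and is not part of the library.

```python
def corners_as_quoted(x, y, width, height):
    """Corner arithmetic from the quoted snippet, applied to pixel values.

    Hypothetical helper for illustration; not a library function.
    """
    x_min = int(x - width / 2 + 1)
    x_max = int(x_min + width)
    y_min = int(y - height / 2 + 1)
    y_max = int(y_min + height)
    return x_min, y_min, x_max, y_max


rows = cols = 5
corners = corners_as_quoted(4.0, 4.0, 3.0, 3.0)
print(corners)            # (3, 3, 6, 6)
print(corners[2] > cols)  # True: x_max falls outside the image
```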

With that said, would it be better to replace the definitions of x_min and x_max (and similarly for y_min and y_max) as follows?

x_min = max(int(x - width / 2), 0)
x_max = min(int(x + width / 2 + 1), cols)

Having the + 1 in the calculation of x_max instead of x_min enlarges the bbox by at most 2 pixels in total (less than 1 pixel on the left and at most 1 pixel on the right, before clamping) but ensures that the object is still enclosed by the bbox. The max(..., 0) and min(..., cols) ensure that x_min and x_max are within acceptable values.
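As a rough sketch of the proposal, here are the same example values run through the clamped formulas. The function name and its signature are assumptions for illustration only, not the library's API.

```python
def corners_clamped(x, y, width, height, rows, cols):
    """Proposed variant: round outwards, then clamp to the image.

    Hypothetical helper for illustration; not a library function.
    """
    x_min = max(int(x - width / 2), 0)
    x_max = min(int(x + width / 2 + 1), cols)
    y_min = max(int(y - height / 2), 0)
    y_max = min(int(y + height / 2 + 1), rows)
    return x_min, y_min, x_max, y_max


clamped = corners_clamped(4.0, 4.0, 3.0, 3.0, rows=5, cols=5)
print(clamped)  # (2, 2, 5, 5)
```

For this input the exact box spans 2.5 to 5.5 on each axis, so the result encloses the part of it that lies inside the image.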

@christian-cahig
Author

There's a flaw in this idea: the bounding boxes only get larger with every transformation. That could be a problem, especially when many transformations are chained.
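A one-dimensional sketch of that growth, under the assumption that each transformation converts to pixel corners with the proposed formulas and then back to center and size. Both helpers are hypothetical; the library's actual round trip is not shown in this thread.

```python
def to_corners(center, size, limit):
    """Proposed outward rounding with clamping, along a single axis."""
    lo = max(int(center - size / 2), 0)
    hi = min(int(center + size / 2 + 1), limit)
    return lo, hi


def to_center_size(lo, hi):
    """Assumed inverse: midpoint and extent of the corner pair."""
    return (lo + hi) / 2, float(hi - lo)


center, size = 50.0, 11.0
widths = [size]
for _ in range(4):  # four simulated round trips
    lo, hi = to_corners(center, size, limit=100)
    center, size = to_center_size(lo, hi)
    widths.append(size)

print(widths)  # [11.0, 12.0, 13.0, 14.0, 15.0]
```

In this example the box gains one pixel per round trip, because int(v + 1) is always strictly greater than v while the lower edge never moves back in.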


Dimfred commented May 27, 2021

Why denormalize and then normalize again in the first place?
The yolo => albumentations conversion can be done directly on the normalized values:

x, y, w, h = bbox  # from yolo: normalized center and size
w_half, h_half = w / 2, h / 2

x_min, x_max = x - w_half, x + w_half
y_min, y_max = y - h_half, y + h_half

return x_min, y_min, x_max, y_max  # albumentations order

That avoids the pixel rounding errors in the first place. Is there a particular reason why it is not done like this?
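Wrapped in a function, the direct conversion might look like the sketch below. The function name is made up for illustration, and the tail handling simply mirrors the bbox[4:] slicing from the snippet in the first comment; this is not the fix that was eventually merged.

```python
def yolo_to_albumentations(bbox):
    """Convert (x_center, y_center, w, h, *tail) to (x_min, y_min, x_max, y_max, *tail).

    Works on normalized values only, so there is no int() rounding.
    Hypothetical helper for illustration; not a library function.
    """
    x, y, w, h = bbox[:4]
    tail = tuple(bbox[4:])  # extra fields such as a class label
    w_half, h_half = w / 2, h / 2
    return (x - w_half, y - h_half, x + w_half, y + h_half) + tail


print(yolo_to_albumentations((0.5, 0.5, 0.25, 0.5, "cat")))
# (0.375, 0.25, 0.625, 0.75, 'cat')
```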

Collaborator

Dipet commented May 27, 2021

It looks like a bug that needs to be fixed.

Collaborator

Dipet commented Jul 7, 2021

Should be fixed by #924

Dipet closed this as completed Jul 7, 2021