[bug fix] potential problem if img fed to model is in rectangular shape #275
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Hello! I made the PR of bug #266
I also noticed that, in utils/utils.scale_coords, the gain calculation is quite confusing.
It might be incorrect if the image to the model img1 is in rectangular shape but the original image img0 is in squre, or img1 is larger in width but img0 is larger in height.
Actually gain and pad here should be exactly idendity to the r and (dw, dh) in utils/datasets.letterbox, since it's just the reverse operation. So I also modified this part.
But still, there are several round operations in utils/datasets.letterbox, but the reverse operation of utils/utils.scale_coords uses exact float number, which may cause some slight error. So if possible, I hope to change to use ratio_pad in utils/utils.scale_coords to directly set the ratio and pad from utils/datasets.letterbox.
And actually in my own application, I will use some images with very large width/height ratio, and that's why I am focusing on this width/height process. But currently train.py only do preprocess to resize and pad the raw image to square. This is acceptable but not preferrable. If possible, I hope to have the feature of allowing rectangular shaped images for training.
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Enhancements to image resizing and coordinate scaling logic in YOLOv5's data preprocessing.
📊 Key Changes
new_unpad
calculation inletterbox
function to correctly reflect (height, width) ordering.gain
calculation inscale_coords
to use the minimum scale factor between width and height, rather than the maximum of the two dimensions.🎯 Purpose & Impact
letterbox
change ensures the aspect ratio is maintained during image resizing without padding errors.scale_coords
adjustment provides more accurate bounding box coordinates after scaling, important for object detection accuracy.