take EXIF orientation tags into account when fixing corrupt images #5270

Merged on Oct 20, 2021 (5 commits)


jdfr
Contributor

@jdfr jdfr commented Oct 20, 2021

I have some images that OpenCV can open just fine, but that are detected as corrupt in verify_image_label(). They also carry EXIF orientation tags, which OpenCV handles just fine but which Pillow ignores unless you use a specific incantation. The result is that, when these images are re-saved, they end up transposed without the labels also being transposed.

This makes Pillow apply EXIF orientation tags, if present, before saving the image, in order to avoid wreaking havoc on training datasets.
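
The behaviour described above can be reproduced with a small, self-contained sketch. It is an illustration, not code from this PR: it assumes a Pillow version that provides `Image.Exif` and `ImageOps.exif_transpose`, and the helper `make_rotated_jpeg` is a hypothetical name introduced only for the demo.

```python
import io

from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # standard EXIF orientation tag


def make_rotated_jpeg(width=40, height=20, orientation=6):
    """Build an in-memory JPEG whose EXIF data asks viewers to rotate it (hypothetical helper)."""
    im = Image.new("RGB", (width, height), "white")
    exif = Image.Exif()
    exif[ORIENTATION_TAG] = orientation  # 6 = stored sideways, rotate to display
    buf = io.BytesIO()
    im.save(buf, "JPEG", exif=exif.tobytes())
    buf.seek(0)
    return buf


raw = Image.open(make_rotated_jpeg())   # Pillow does not apply the tag on open
fixed = ImageOps.exif_transpose(raw)    # explicit call that applies the tag

print("as opened:", raw.size)
print("after exif_transpose:", fixed.size)
```

With orientation 6 the width and height swap after `exif_transpose`, which is why labels annotated against the upright view no longer line up with an image that was re-saved without this step.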

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Upgraded image handling with EXIF-aware transposition during verification.

📊 Key Changes

  • ImageOps module from PIL is now imported.
  • An inplace version of the EXIF transpose function is used.
  • Corrupt JPEG images get EXIF transposed before being re-saved.

🎯 Purpose & Impact

  • Purpose: Ensure that images are oriented according to their EXIF data during image verification, and repair corrupt JPEG images when they are encountered.
  • Impact: Re-saved images keep the orientation that matches their labels when used within the YOLOv5 framework. This may also prevent errors caused by incorrectly oriented images and fix minor corruption in JPEG files.
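
A rough sketch of the repair step summarised above. How corruption is detected is not shown in this conversation, so the end-of-image marker check below is an assumption, and `ends_with_eoi` and `repair_jpeg` are hypothetical helper names. The save parameters are the ones quoted later in the thread.

```python
import os

from PIL import Image, ImageOps

JPEG_EOI = b"\xff\xd9"  # end-of-image marker that closes a well-formed JPEG


def ends_with_eoi(path):
    """Return True if the file's last two bytes are the JPEG end-of-image marker."""
    with open(path, "rb") as f:
        f.seek(-2, os.SEEK_END)
        return f.read() == JPEG_EOI


def repair_jpeg(path):
    """Re-save a JPEG in place, applying its EXIF orientation first (assumed workflow)."""
    with Image.open(path) as im:
        fixed = ImageOps.exif_transpose(im)  # returns a separate, fully loaded image
    # Save only after the source file is closed, since we overwrite the same path.
    fixed.save(path, "JPEG", subsampling=0, quality=100)
```

A caller would run `repair_jpeg(path)` only when `ends_with_eoi(path)` is false, so intact files are never rewritten.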

@glenn-jocher
Member

@jdfr thanks for the PR!

I can't test this locally as I don't have the right kind of corrupted images with EXIF rotation tags, so I'll have to rely on your results. Are you certain this 1) addresses your issue and 2) doesn't introduce any new issues?

@glenn-jocher
Member

/rebase

jdfr and others added 5 commits October 20, 2021 17:29
We have a local inplace version that is faster than the official one, as the image is not copied. AutoShape() uses this for Hub models, but here it is not important, as the datasets.py usage is infrequent (AutoShape() applies it to every image).
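
For readers unfamiliar with the distinction, here is one plausible shape for a transpose helper that avoids copying. It is a sketch, not the repository's actual function: `exif_transpose_no_copy` is a hypothetical name, and the orientation-to-method table follows the standard EXIF orientation values.

```python
from PIL import Image

ORIENTATION_TAG = 0x0112  # standard EXIF orientation tag

# EXIF orientation value -> Pillow transpose method that makes the pixels upright
_METHODS = {
    2: Image.FLIP_LEFT_RIGHT,
    3: Image.ROTATE_180,
    4: Image.FLIP_TOP_BOTTOM,
    5: Image.TRANSPOSE,
    6: Image.ROTATE_270,
    7: Image.TRANSVERSE,
    8: Image.ROTATE_90,
}


def exif_transpose_no_copy(image):
    """Apply the EXIF orientation; return the same object untouched when no change is needed."""
    exif = image.getexif()
    method = _METHODS.get(exif.get(ORIENTATION_TAG, 1))
    if method is None:
        return image  # nothing to do: hand back the original, no copy made
    transposed = image.transpose(method)
    del exif[ORIENTATION_TAG]  # pixels are upright now, so drop the tag
    transposed.info["exif"] = exif.tobytes()
    return transposed
```

The saving comes from the common case: an image without an orientation tag is returned as-is rather than duplicated, which matters when the function runs on every image and much less when it runs only on the occasional corrupt file.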
@jdfr
Contributor Author

jdfr commented Oct 20, 2021

Well, it works for my use case, but I can't say for sure it's 100% safe (also I see you are very careful about exceptions thrown when handling exif data).

How about this?

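                # requires: from PIL import Image, ImageOps
                try:  # apply the EXIF orientation tag, if any, before re-saving
                    ImageOps.exif_transpose(Image.open(im_file)).save(im_file, 'JPEG', subsampling=0, quality=100)
                except Exception:  # fall back to a plain re-save if the EXIF data cannot be handled
                    Image.open(im_file).save(im_file, 'JPEG', subsampling=0, quality=100)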

or this?

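                _img = Image.open(im_file)
                try:
                    _img = ImageOps.exif_transpose(_img)  # apply the EXIF orientation tag, if any
                except Exception:  # keep the untransposed image if the EXIF data cannot be handled
                    pass
                _img.save(im_file, 'JPEG', subsampling=0, quality=100)
                del _img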

@glenn-jocher
Member

@jdfr ok got it. It'll probably be ok, as the entire image/label check is within its own try/except statement. If there's an error then the image/label pair will just not be used and the user will be notified. If there's a silent error then we won't catch it, but neither would the solutions above.
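
The safety net described in this comment can be sketched as follows. This is a simplified illustration with hypothetical names (`filter_valid`, `fake_check`), not the project's verification code.

```python
def filter_valid(paths, check):
    """Run `check` on every path; keep the ones that pass and collect warnings for the rest."""
    kept, warnings = [], []
    for path in paths:
        try:
            check(path)  # may raise on a corrupt image or a bad label
            kept.append(path)
        except Exception as e:
            # The pair is dropped and the user is told why.
            warnings.append(f"WARNING: {path}: ignoring corrupt image/label: {e}")
    return kept, warnings


def fake_check(path):
    """Stand-in verifier (hypothetical): rejects any path containing 'bad'."""
    if "bad" in path:
        raise ValueError("cannot decode image")


kept, warnings = filter_valid(["a.jpg", "bad.jpg", "c.jpg"], fake_check)
print(kept)
print(warnings)
```

A check that returns normally but leaves the image wrongly oriented raises nothing, so a wrapper like this cannot detect it. That is the "silent error" case mentioned above.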

@glenn-jocher glenn-jocher merged commit 15e8c4c into ultralytics:master Oct 20, 2021
@glenn-jocher
Member

@jdfr PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
…ltralytics#5270)

* take EXIF orientation tags into account when fixing corrupt images

* fit 120 char

* sort imports

* Update local exif_transpose comment

We have a local inplace version that is faster than the official one, as the image is not copied. AutoShape() uses this for Hub models, but here it is not important, as the datasets.py usage is infrequent (AutoShape() applies it to every image).

* Update datasets.py

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2 participants