Memory Error Corrupt JPEG data: 2 extraneous bytes before marker 0xd9 #8908

blackShine-2 · 2022-08-09T01:36:22Z

Search before asking

I have searched the YOLOv5 issues and discussions and found no similar questions.

Question

I have successfully used the YOLOv5 model on another dataset. However, with this particular dataset, I am getting the following error:
Memory Error Corrupt JPEG data: 2 extraneous bytes before marker 0xd9
Please help me solve this.

Additional

No response


glenn-jocher · 2022-08-09T20:54:05Z

@blackShine-2 seems like something is wrong with some of your JPEGs. We try to preprocess the dataset and indicate problem images but it seems that your images are causing errors.

github-actions · 2022-09-09T00:26:35Z

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


unrue · 2024-04-12T07:41:41Z

Is it possible to retrieve the image name? I have a similar problem during training:

197/199 51.6G 0.01721 0.02663 0.01128 1141 640: 56%|█████▌ | 170/305 [02:00<01:36, 1.41it/s]Corrupt JPEG data: 2 extraneous bytes before marker 0xd6
197/199 51.6G 0.01725 0.0267 0.01131 1196 640: 63%|██████▎ | 191/305 [02:15<01:21, 1.41it/s]Corrupt JPEG data: 6 extraneous bytes before marker 0xd1
197/199 51.6G 0.01725 0.02669 0.01132 1049 640: 65%|██████▍ | 198/305 [02:20<01:16, 1.41it/s]Corrupt JPEG data: 6 extraneous bytes before marker 0xd1
197/199 51.6G 0.01726 0.02668 0.01133 1082 640: 82%|████████▏ | 249/305 [02:57<00:41, 1.34it/s]Corrupt JPEG data: 2 extraneous bytes before marker 0xd5

But I don't know the images involved in a dataset with 30k images.

glenn-jocher · 2024-04-12T16:31:32Z

Hello! 👋 It sounds like you're encountering a common issue where specific training images may be corrupted. Unfortunately, YOLOv5's training logs don't directly output the names of corrupt images during training.

To identify the corrupt images, you might consider running a separate script before training that checks each image's integrity. Here's a quick Python snippet that could help:

import os
from PIL import Image

def check_images(folder_path):
    """Print the path of every JPEG under folder_path that Pillow cannot verify."""
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            # Match extensions case-insensitively so .JPG and .JPEG files are checked too
            if file.lower().endswith(('.jpg', '.jpeg')):
                file_path = os.path.join(root, file)
                try:
                    with Image.open(file_path) as img:  # Open the image file
                        img.verify()  # Check the file structure (pixel data is not decoded)
                except (IOError, SyntaxError) as e:
                    print(f'Corrupt image found: {file_path} ({e})')

check_images('/path/to/your/dataset')

Replace '/path/to/your/dataset' with the actual path to your dataset. This script will print the paths of corrupt JPEG images, which you can then review or remove from your dataset to prevent these errors during training.
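Since Pillow's verify() does not decode the pixel data, a complementary check is to look at the raw bytes. The following is a minimal sketch, not part of YOLOv5, that flags files lacking the JPEG start-of-image (FF D8) or end-of-image (FF D9) marker. The names jpeg_marker_problems and scan_jpegs are hypothetical. This only catches truncated or mislabeled files; a warning about extraneous bytes in the middle of the stream can still occur in a file that passes it, and some valid JPEGs carry padding after FF D9, so treat hits as suspects rather than proof.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def jpeg_marker_problems(file_path):
    """Return a list of marker problems found in one file (empty if none)."""
    with open(file_path, "rb") as f:
        head = f.read(2)
        # Jump to the last two bytes (or to the start for very short files)
        f.seek(max(os.path.getsize(file_path) - 2, 0))
        tail = f.read(2)
    problems = []
    if head != JPEG_SOI:
        problems.append("missing start-of-image marker (FF D8)")
    if tail != JPEG_EOI:
        problems.append("missing end-of-image marker (FF D9)")
    return problems


def scan_jpegs(folder_path):
    """Map each suspicious .jpg/.jpeg path under folder_path to its problems."""
    suspects = {}
    for root, _dirs, files in os.walk(folder_path):
        for name in files:
            if name.lower().endswith((".jpg", ".jpeg")):
                path = os.path.join(root, name)
                problems = jpeg_marker_problems(path)
                if problems:
                    suspects[path] = problems
    return suspects
```

Calling scan_jpegs('/path/to/your/dataset') returns a dictionary that you can print or write to a file for review.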

Hope this helps! 🙂

unrue · 2024-04-15T08:01:34Z

Hi Glenn,

thanks for the reply. I already ran such a check, and no images were reported as corrupt. However, during training I still get the "Corrupt JPEG data" message.

glenn-jocher · 2024-04-15T14:17:28Z

Hi there!

Thanks for running the checks! If the images appear fine but the error persists, it could be related to a transient issue during training data loading. A quick workaround is to catch and handle exceptions within the dataset loading process so that problematic images are skipped. While not ideal, this lets training continue without interruption. Here’s a snippet that could get you started if you're customizing the data loader:

from PIL import Image

def safe_open(path):
    """Return an opened image, or None if Pillow cannot verify the file."""
    try:
        with Image.open(path) as img:
            img.verify()  # Verify the integrity
        return Image.open(path)  # Open it again, as the image is unusable after verify()
    except (IOError, SyntaxError):
        print(f'Corrupt image skipped: {path}')
        return None  # or a placeholder image of your choice

Call safe_open wherever the dataset loader opens images, and have the loader skip any sample for which it returns None. That way a corrupt image is reported with a warning and skipped rather than halting training.

Remember, this is a workaround. It’s always best to investigate and resolve the root cause of corrupted data if possible. 🛠️
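To find which files trigger the message, one option is to decode each image once outside of training and record what appears on stderr. The sketch below rests on an assumption: that the "Corrupt JPEG data" text is written to the process-level stderr by the underlying JPEG decoder instead of being raised as a Python exception, which would explain why a try/except check does not see it. The helper names call_and_capture_stderr and find_files_with_decoder_warnings are hypothetical, and decode stands for whatever function your loader uses to read an image.

```python
import os
import sys
import tempfile


def call_and_capture_stderr(func, *args):
    """Call func(*args) and return (result, text written to fd 2 meanwhile)."""
    sys.stderr.flush()
    saved_fd = os.dup(2)  # Keep a copy of the real stderr
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 2)  # Route fd 2 into the temp file
        try:
            result = func(*args)
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)  # Restore the real stderr
            os.close(saved_fd)
        tmp.seek(0)
        text = tmp.read().decode(errors="replace")
    return result, text


def find_files_with_decoder_warnings(paths, decode, needle="Corrupt JPEG data"):
    """Return (path, message) pairs for files whose decoding emits needle."""
    flagged = []
    for path in paths:
        _, text = call_and_capture_stderr(decode, path)
        if needle in text:
            flagged.append((path, text.strip()))
    return flagged
```

If your loader reads images with OpenCV, passing cv2.imread as decode together with the list of training image paths would be one way to try this; whether the warning actually shows up there depends on your OpenCV and libjpeg build.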

Cheers!

blackShine-2 added the question Further information is requested label Aug 9, 2022

github-actions bot added the Stale label Sep 9, 2022

github-actions bot closed this as completed Sep 20, 2022
