How to change annotations indices in memory without changing the dataset locally? #12949

Closed
1 task done
killich8 opened this issue Apr 22, 2024 · 4 comments
Labels
question Further information is requested Stale

Comments

@killich8

Search before asking

Question

Hello! 😊

I have a large dataset and would like to change the annotations before starting training. The current classes are 0: car, 1: van, 2: bicycle, 3: people, and 4: pedestrian. I would like to merge car and van into a single class (0: car), keep bicycle as 1, and merge people and pedestrian into the people class (2: people). Where in the code can I make this change without altering my dataset locally? Is there a way to change these indices in memory?

Thank you 😊

@killich8 killich8 added the question Further information is requested label Apr 22, 2024
@glenn-jocher
Member

Hello! 😊

Absolutely, you can achieve this by tweaking the dataset loading part of the code, specifically in the datasets.py file. Before your data is passed into the training loop, you can adjust the class indices on the fly. Here's a quick snippet to give you an idea:

# labels: rows of [class, x, y, w, h] for one image
for label in labels:
    cls = int(label[0])
    if cls == 1:  # van -> car
        label[0] = 0
    elif cls == 2:  # bicycle moves down to index 1
        label[0] = 1
    elif cls in (3, 4):  # people and pedestrian -> people
        label[0] = 2
    # car (0) already has the right index

Place this snippet right after the labels for your images are loaded and before they are used for training. This way, you modify the annotations in memory without altering your dataset locally.
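For a large dataset, the same remapping can be done per image with a NumPy lookup table instead of a Python loop over rows. The sketch below assumes each image's labels are an (n, 5) array of `[class, x, y, w, h]` rows; `CLASS_MAP` and `remap_classes` are illustrative names, not part of YOLOv5.

```python
import numpy as np

# Old index -> new index, taken from the question:
# car(0), van(1) -> car(0); bicycle(2) -> bicycle(1); people(3), pedestrian(4) -> people(2)
CLASS_MAP = {0: 0, 1: 0, 2: 1, 3: 2, 4: 2}


def remap_classes(labels: np.ndarray, class_map: dict) -> np.ndarray:
    """Return a copy of an (n, 5) label array with column 0 remapped."""
    # Build a lookup table; -1 marks old indices that have no mapping.
    lut = np.full(max(class_map) + 1, -1, dtype=np.int64)
    for old, new in class_map.items():
        lut[old] = new

    out = labels.copy()  # leave the caller's array untouched
    if len(out):
        old_ids = out[:, 0].astype(np.int64)
        if old_ids.min() < 0 or old_ids.max() >= len(lut) or (lut[old_ids] < 0).any():
            raise ValueError("label array contains a class index missing from class_map")
        out[:, 0] = lut[old_ids]
    return out


# Example: one van, one bicycle, one pedestrian
labels = np.array(
    [
        [1, 0.5, 0.5, 0.2, 0.2],
        [2, 0.3, 0.3, 0.1, 0.1],
        [4, 0.7, 0.7, 0.1, 0.3],
    ],
    dtype=np.float32,
)
remapped = remap_classes(labels, CLASS_MAP)
print(remapped[:, 0])
```

Returning a copy keeps the original array intact, which makes it harder to apply the mapping twice by accident. Raising on an unmapped index is a design choice so that unexpected classes surface early rather than silently passing through.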

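Remember, changes made this way are not permanent: the label files on disk stay as they are, and the remapping is reapplied each time the data is loaded for training. Since the merged set has only three classes, also make sure the class names (and nc, if your dataset YAML has it) that you train with describe car, bicycle, and people, so the model matches the remapped labels.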

Happy coding! 😊

@killich8
Author

Hello @glenn-jocher

Thank you very much for your response.
I can't find the file datasets.py

@glenn-jocher
Member

Hello! 😊

My apologies for the confusion. In recent YOLOv5 releases the data-loading code no longer lives in datasets.py; as far as I can tell it was renamed to utils/dataloaders.py, so that is the first place to look.

For adjusting labels, the spot you want is where the dataset class reads the label files into memory (the LoadImagesAndLabels class in current versions, if I recall correctly). Apply the class index remapping right after the labels are loaded there, before augmentation or batching uses them.

If you're navigating the latest structure and still can't locate the precise spot for this adjustment, I'd recommend reviewing the documentation or exploring the source where datasets are loaded and preprocessed. Understanding how data flows through these starting points will give you a clear indication of where to implement the class index adjustments.
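Once you have found where the loader keeps its labels, the remapping can be applied to all images in one pass. This is only a sketch: it assumes the loader holds a list of per-image (n, 5) arrays, and `per_image_labels`, `LUT`, `NEW_NAMES`, and `remap_all_in_place` are hypothetical names used for illustration.

```python
import numpy as np

# index = old class id, value = new class id
# car(0), van(1) -> 0; bicycle(2) -> 1; people(3), pedestrian(4) -> 2
LUT = np.array([0, 0, 1, 2, 2])
NEW_NAMES = ["car", "bicycle", "people"]


def remap_all_in_place(per_image_labels, lut):
    """Rewrite column 0 of every per-image (n, 5) array, in memory only."""
    for arr in per_image_labels:
        if len(arr):  # skip images without objects
            arr[:, 0] = lut[arr[:, 0].astype(int)]
    return per_image_labels


# Stand-in for the list of label arrays a loader would build (assumption)
per_image_labels = [
    np.array([[0, 0.5, 0.5, 0.2, 0.2], [1, 0.4, 0.4, 0.2, 0.2]], dtype=np.float32),
    np.zeros((0, 5), dtype=np.float32),  # image with no objects
    np.array(
        [[3, 0.1, 0.1, 0.05, 0.1], [4, 0.8, 0.8, 0.05, 0.1], [2, 0.6, 0.6, 0.1, 0.1]],
        dtype=np.float32,
    ),
]
remap_all_in_place(per_image_labels, LUT)

# Sanity check: every remaining class id must index into the new name list
seen = sorted({int(c) for arr in per_image_labels for c in arr[:, 0]})
assert seen and max(seen) < len(NEW_NAMES)
print(seen)
```

Because this version edits the arrays in place, it must run exactly once per load; running it a second time would remap the already merged indices again (for example, bicycle at 1 would become car).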

Keep the exploration going, and you're doing great! 😊

Contributor

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale label May 23, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 2, 2024