Dataset upload fails with large number of images #42
Comments
👋 Hello @barney2074, thank you for raising an issue about Ultralytics HUB 🚀! Please visit https://ultralytics.com/hub to learn more, and see our ⭐️ HUB Guidelines to quickly get started uploading datasets and training YOLOv5 models. If this is a 🐛 Bug Report, please provide screenshots and steps to recreate your problem to help us get started working on a fix. If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response. We try to respond to all issues as promptly as possible. Thank you for your patience! |
@barney2074 thanks for the bug report! We will take a look here and see what might be wrong. |
@barney2074 to answer your other question only models trained on HUB can be used with the app currently. |
Thanks Glenn. Since posting, I've been trying a few options to narrow down the problem. Initially I thought it might be invalid labels (my images are synthetic and some labels are full-frame, depending on the randomization of camera position), but it seems this is not the case. My filenames have extra periods in them, and I also thought this could be a problem, but the small test dataset that works also has this. I can send you a small-ish dataset that fails if it helps. Andrew |
@barney2074 we've found an issue on our side related to large dataset uploads as you said. We will keep you updated when a fix has been implemented. |
@barney2074 we are still working on preventing the same issue from happening again. For now I have manually forced processing of the dataset in your screenshot which you can access with your account at this link: |
Thanks @kalenmike. I also can't see the dataset in the 'train a model' page. But that is fine. I really appreciate your quick response, and it seems to be a bug rather than me doing something wrong, so no hurry. Andrew |
@barney2074 Perhaps you are using multiple accounts with the HUB? As we are lucky enough to only have two datasets with the name 'constructaiv7' I was able to track them down to the same user, one is now active and visible while the other is still in a failed state. Unfortunately I am not able to provide more info than that. I will keep you updated and once we resolve the issue that caused the bug you can also try a new upload. |
Hi @kalenmike. Yes, sorry, my error: I had inadvertently created 2 accounts. Andrew |
@barney2074 Perfect! We have addressed the bug that you found and it seems to be resolved. Please let us know if you still have issues with uploading the larger dataset. |
thanks @kalenmike - I'll give it a try |
Hi @kalenmike, sorry to trouble you, but I'm still having the same problem. It seems to sit at the pulsing yellow 'processing' button for a couple of hours, then says it has failed. The v1.1 dataset only has 40 images, and it's been processing for at least an hour. Andrew |
@barney2074 Ok thanks. I will look into the logs and see what is going wrong. |
@barney2074 I am seeing that your new datasets have been failing because the YAML is incorrectly formatted. It looks like you have duplicated the names key. This can help to ensure that the YAML is correctly formatted: |
Hi @kalenmike, sorry...! I must have checked that 10 times, but just couldn't see the obvious error. That said, it would be great to have either some detailed error reporting, or the ability to load/validate YAML, images & labels separately (kinda like Roboflow, where stuff is uploaded separately, then gets matched up). Andrew |
@barney2074 Thanks for the feedback. I agree, as humans we are prone to error. We are planning to improve the error reporting, as well as to auto-repair detected errors so that a re-upload is not needed. |
@barney2074 @kalenmike a duplicate names key is incorrect YAML, but it doesn't seem to raise an error with the YOLOv5 YAML loaders as far as I can tell. I tested training and dataset_stats() with duplicate names fields (identical and different names) and both still work correctly for me. Regardless, it helps to examine your YAMLs with an IDE that highlights errors, like PyCharm (note red underlines): |
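Since, as noted above, a duplicated key may load without any error, a quick pre-check before uploading can help. The following is a minimal sketch using only the Python standard library. It is a line-based heuristic, not a YAML parser, and it only looks at unindented top-level keys. The function name `find_duplicate_top_level_keys` and the example dataset YAML (paths and class names) are hypothetical and are not taken from this issue.

```python
import re

# Matches an unindented "key:" at the start of a line (a top-level mapping key).
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:(\s|$)")


def find_duplicate_top_level_keys(yaml_text):
    """Return top-level keys that appear more than once, in order of first repeat."""
    seen = set()
    duplicates = []
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if not match:
            continue  # comments, indented lines, list items, blank lines
        key = match.group(1)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical dataset YAML with the names key pasted twice.
example = """\
train: images/train
val: images/val
nc: 2
names: ['person', 'truck']
names: ['person', 'truck']
"""

print(find_duplicate_top_level_keys(example))
```

Because the check is purely textual, it can be fooled by multi-line strings whose content starts at column 0, so it is best treated as a first sanity check alongside an IDE or a proper YAML linter.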
@glenn-jocher The error was because the names key was inputted like this |
Hi @kalenmike, sorry, this was caused by a dumb copy/paste error on my part...! I'm wondering if part of the solution might be to complete YAML parsing/validation as soon as the ZIP dataset is transferred, since it seems to take a long time on 'processing' before timing out. I imagine my earlier suggestion, i.e. uploading in batches and then processing it all together, would complicate the process considerably. Andrew |
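As an illustration of the fail-fast idea in the comment above, here is a rough sketch of a check that inspects only the ZIP listing, so it could run right after transfer and before any long processing step. It is not how HUB works internally. The function name `quick_zip_check`, the "exactly one YAML file" rule, the list of image suffixes, and matching images to labels by file stem are all assumptions made for this example.

```python
import io
import zipfile
from pathlib import PurePosixPath

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types


def quick_zip_check(zip_file):
    """Return a list of human-readable problems found from the ZIP listing alone."""
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        # Skip directory entries, which end with "/".
        names = [n for n in archive.namelist() if not n.endswith("/")]

    paths = [PurePosixPath(n) for n in names]

    # Assumption: the archive should contain exactly one dataset YAML.
    yamls = [p for p in paths if p.suffix.lower() in {".yaml", ".yml"}]
    if len(yamls) != 1:
        problems.append(f"expected exactly one YAML file, found {len(yamls)}")

    # Crude matching: an image "a.jpg" is paired with any "a.txt" in the archive.
    label_stems = {p.stem for p in paths if p.suffix.lower() == ".txt"}
    for p in paths:
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in label_stems:
            problems.append(f"image without label file: {p}")
    return problems


# Build a small in-memory archive (hypothetical layout) to try the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo/demo.yaml", "train: images/train\n")
    archive.writestr("demo/images/train/a.jpg", b"")
    archive.writestr("demo/labels/train/a.txt", "0 0.5 0.5 1.0 1.0\n")
    archive.writestr("demo/images/train/b.jpg", b"")

print(quick_zip_check(buffer))
```

An image without a label file may be intentional (for example, a background image), so such entries are better treated as warnings to review than as hard failures.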
@kalenmike got it. I've opened ultralytics/yolov5#8192 to better report YAML load errors such as the one above; please review. |
Question
Hello,
I am having a problem uploading a dataset:
i.e., in the first screenshot below, constructaiv6 and constructaiv7 are identical except for the number of images and labels.
I'm loving YOLOv5 and finding it very easy to use, but would like to try the HUB option.
(A separate question: I was wondering if there is any way to use a locally trained model on the phone app?)
I'd be very grateful for any help with this.
Andrew