Dataset Upload Failed with Multiple YAML's found #89
Comments
👋 Hello @PanterSoft, thank you for raising an issue about Ultralytics HUB 🚀! Please visit https://ultralytics.com/hub to learn more, and see our ⭐️ HUB Guidelines to quickly get started uploading datasets and training YOLOv5 models. If this is a 🐛 Bug Report, please provide screenshots and steps to recreate your problem to help us get started working on a fix. If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response. We try to respond to all issues as promptly as possible. Thank you for your patience! |
@PanterSoft hi I saw this issue was resolved on your side? Should we close the issue? |
Sorry, that was a mistake. I still have the same problem. |
I found the issue: when you create a zip file on macOS, it adds a hidden folder to the archive that appears to contain the same files, which then triggers the multiple-YAML-files error. I solved it by uploading the dataset from Windows. |
@nicomattes Thanks for letting us know. We will look into improving this for zips created with MacOS. |
I have the same problem. |
@mehlkelm hi! Could you explain in more detail how the issue arises or show us screenshots of what your directory structure looks like that is causing problems? |
The issue arises (I think) because macOS creates a hidden (for mac users) directory "__MACOSX" inside the zip file that contains additional information about all the files. To a non-mac OS it looks like everything is duplicated. Maybe the dataset upload tool should just ignore the __MACOSX folder in there. |
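A minimal sketch of the suggestion above, assuming the upload check works by collecting YAML files from the archive listing. The function name `find_yaml_files` and its `skip_macos_metadata` parameter are hypothetical and are not part of HUB; the sketch only shows how skipping `__MACOSX` entries leaves a single YAML file.

```python
import zipfile


def find_yaml_files(zip_path, skip_macos_metadata=True):
    """Return YAML member names in a zip, optionally skipping __MACOSX entries.

    Hypothetical helper for illustration; not the HUB implementation.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    yamls = [n for n in names if n.lower().endswith((".yaml", ".yml"))]
    if skip_macos_metadata:
        # Drop anything stored under a __MACOSX directory
        yamls = [n for n in yamls if "__MACOSX" not in n.split("/")]
    return yamls
```

With the filter off, an archive zipped on macOS would report both the real YAML and its `._` companion under `__MACOSX`; with the filter on, only the real one remains.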
@mehlkelm yeah good point. We definitely want a macOS-robust tool. Could you upload a small zip that's crashing HUB here or email it to glenn.jocher@ultralytics.com? Thanks! |
This is a completely made up example but it shows the problem. |
@mehlkelm thanks! We'll use the zip to debug and improve the dataset upload process. |
When unzipping I see the following:
(venv39) glennjocher@Glenns-MacBook-Air Downloads % unzip macos_zip_problem_demo.zip
Archive: macos_zip_problem_demo.zip
creating: macos_zip_problem_demo/
inflating: macos_zip_problem_demo/.DS_Store
inflating: __MACOSX/macos_zip_problem_demo/._.DS_Store
creating: macos_zip_problem_demo/images/
inflating: macos_zip_problem_demo/macos_zip_problem_demo.yaml
inflating: __MACOSX/macos_zip_problem_demo/._macos_zip_problem_demo.yaml
creating: macos_zip_problem_demo/labels/
inflating: macos_zip_problem_demo/images/.DS_Store
inflating: __MACOSX/macos_zip_problem_demo/images/._.DS_Store
creating: macos_zip_problem_demo/images/train/
creating: macos_zip_problem_demo/images/val/
inflating: macos_zip_problem_demo/labels/.DS_Store
inflating: __MACOSX/macos_zip_problem_demo/labels/._.DS_Store
creating: macos_zip_problem_demo/labels/train/
creating: macos_zip_problem_demo/labels/val/
inflating: macos_zip_problem_demo/images/train/IMG_1931.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/train/._IMG_1931.jpeg
inflating: macos_zip_problem_demo/images/train/IMG_1925.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/train/._IMG_1925.jpeg
inflating: macos_zip_problem_demo/images/train/IMG_1923.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/train/._IMG_1923.jpeg
inflating: macos_zip_problem_demo/images/train/IMG_1915.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/train/._IMG_1915.jpeg
inflating: macos_zip_problem_demo/images/val/IMG_1927.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/val/._IMG_1927.jpeg
inflating: macos_zip_problem_demo/images/val/IMG_1914.jpeg
inflating: __MACOSX/macos_zip_problem_demo/images/val/._IMG_1914.jpeg
inflating: macos_zip_problem_demo/labels/train/IMG_1915.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/train/._IMG_1915.txt
inflating: macos_zip_problem_demo/labels/train/IMG_1923.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/train/._IMG_1923.txt
inflating: macos_zip_problem_demo/labels/train/IMG_1925.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/train/._IMG_1925.txt
inflating: macos_zip_problem_demo/labels/train/IMG_1931.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/train/._IMG_1931.txt
inflating: macos_zip_problem_demo/labels/val/IMG_1914.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/val/._IMG_1914.txt
inflating: macos_zip_problem_demo/labels/val/IMG_1927.txt
inflating: __MACOSX/macos_zip_problem_demo/labels/val/._IMG_1927.txt
We need to update to be robust to this. I'll look into it. |
All the files in __MACOSX are hidden, as are the .DS_Store files, which are also some sort of OS specific info files. Maybe ignoring hidden files in general would be enough… |
(according to the unix way of treating files with names starting with a dot as hidden) |
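As a rough illustration of the "ignore hidden files" idea, the sketch below treats a zip member as hidden when any component of its path starts with a dot. The helper name `is_hidden` is an assumption made for this example; the sample paths are taken from the unzip listing above.

```python
from pathlib import PurePosixPath


def is_hidden(member_name):
    """True if any component of a zip member path starts with a dot."""
    return any(part.startswith(".") for part in PurePosixPath(member_name).parts)


names = [
    "macos_zip_problem_demo/.DS_Store",
    "macos_zip_problem_demo/macos_zip_problem_demo.yaml",
    "__MACOSX/macos_zip_problem_demo/._macos_zip_problem_demo.yaml",
    "macos_zip_problem_demo/images/train/IMG_1931.jpeg",
]
# Keep only members with no dot-prefixed path component
visible = [n for n in names if not is_hidden(n)]
```

Note that the `__MACOSX` directory name itself does not start with a dot, so this rule only works here because every file inside it in the listing is prefixed with `._`. A directory entry for `__MACOSX` would still pass the check.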
@mehlkelm actually this is really strange: on my Mac, if I unzip using the GUI/mouse commands, everything unzips correctly into 1 directory. If I unzip using the terminal, I get two top-level directories (macos_zip_problem_demo and __MACOSX). |
I guess the regular mac unzip processes the information in __MACOSX (technically they are resource forks) and doesn't interpret them as files, while all the non Mac tools/OS (even unzip in the terminal on mac?) think they are files. |
but yes, it's silly that the mac puts them there |
It's OK, we can't change macOS but we can change HUB. I'll let you know when this is resolved. |
@kalenmike I'm not able to produce any errors with:
from utils.dataloaders import HUBDatasetStats
stats = HUBDatasetStats('/Users/glennjocher/Downloads/sandbox/macos_zip_problem_demo.zip')
stats.get_json()
stats.process_images()
Initially the directory contained only the zip. After unzipping, it contained the unzipped directory plus an extra __MACOSX directory with duplicated data. But the actual dataset processing seemed to work without any duplicate YAML errors. Could you check HUB to see if the error is produced downstream of these steps? |
@kalenmike @mehlkelm I've opened ultralytics/yolov5#9843 in YOLOv5 to better handle unzipping while rejecting files in an exclude list, i.e. .DS_Store and __MACOSX instances in file paths. |
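The exclude-list approach described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the linked pull request: the function name `extract_excluding` and its parameters are hypothetical, and a member is skipped whenever its path contains any string in the exclude list.

```python
import zipfile
from pathlib import Path


def extract_excluding(zip_path, dest=None, exclude=(".DS_Store", "__MACOSX")):
    """Extract a zip, skipping members whose path contains an excluded string.

    Hypothetical helper for illustration. Returns the extracted member names.
    """
    # Default to extracting next to the zip file
    dest = Path(dest) if dest is not None else Path(zip_path).parent
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if any(x in name for x in exclude):
                continue  # macOS metadata, do not write to disk
            zf.extract(name, path=dest)
            extracted.append(name)
    return extracted
```

Substring matching is deliberately simple. It would also skip a legitimate file whose name happens to contain one of the excluded strings, which seems unlikely for these two values.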
@glenn-jocher There is independent validation on the server that is not excluding the MACOSX folder by the looks of it. |
@kalenmike ok got it, I'll let you try to debug on the server side. |
This is now resolved and zip files created on Mac can now be processed successfully without special zipping. |
@kalenmike thanks for the fix!! @mehlkelm tested today and everything works now. Removing TODO. Let us know if you find any other issues or think of features you'd like to see! |
Search before asking
HUB Component
No response
Bug
I am trying to upload a dataset which has exactly the structure the tutorial suggests. The YAML file has the correct format and the labels consist only of numbers.
After that did not work, I tried the example dataset coco6 and got the same result, shown in the image below.
Environment
I am using a 2021 MacBook Pro 14 inch M1 Pro running Safari.
Minimal Reproducible Example
Additional
No response