Raise ImportError on importing malformed COCO directory #812

@vinnamkim vinnamkim commented Feb 17, 2023

Summary

  • Previously, COCOImporter silently set rootpath="" and images_dir="" when the directory structure was not well formed, so users got no indication that their dataset structure was malformed. As a result, after importing, users tried to load the images, failed, and could not tell why.
  • This patch enforces the COCO directory structure rule more strictly at the dataset import level to prevent misbehavior in later steps.
  • In addition, it adds a check that an Image has a valid number of dimensions: 2 or 3. Many existing test cases constructed Image() with data that did not have 3 dimensions; this PR fixes those as well.
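
The dimension check in the last bullet could look roughly like the sketch below. It is not the code from this PR: the function name check_image_shape is hypothetical, and it works on a plain shape tuple (such as the one an array's .shape attribute returns) so that it runs without any third-party package.

```python
from typing import Tuple


def check_image_shape(shape: Tuple[int, ...]) -> None:
    """Raise ValueError unless the shape describes 2-D or 3-D image data.

    Hypothetical helper for illustration only; the actual check in the PR
    lives in the Image class and may differ in name and error type.
    """
    # 2 dims: (height, width); 3 dims: (height, width, channels).
    if len(shape) not in (2, 3):
        raise ValueError(
            f"Image data must have 2 or 3 dimensions, got {len(shape)}: {shape}"
        )


if __name__ == "__main__":
    check_image_shape((4, 6))        # grayscale-like data passes
    check_image_shape((4, 6, 3))     # color-like data passes
    try:
        check_image_shape((1, 4, 6, 3))  # a batch-like 4-D shape is rejected
    except ValueError as err:
        print(err)
```

Failing at construction time gives the error at the point where the bad data enters, rather than later when the image is read or converted.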

How to test

I added a unit test for it.

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

 - Previously, COCOImporter implicitly set its rootpath and images_dir
when the directory structure was not well formed. Users therefore could not
tell that their dataset structure was malformed. As a result, after importing,
users tried to load the images, failed, and could not tell why.

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
@vinnamkim vinnamkim marked this pull request as ready for review February 17, 2023 05:14
@vinnamkim vinnamkim added ENHANCE Enhancement of existing features data formats PR is related to dataset formats labels Feb 17, 2023
sooahleex previously approved these changes Feb 17, 2023
@sooahleex sooahleex left a comment

It looks good to me. I have just one question: you tested some wrong-structure cases such as # Wrong structure: ./annotations -> ./labels and # Wrong structure: ./images -> ./imgs. Do we need to consider other kinds of wrong structure as well?


vinnamkim commented Feb 17, 2023

It looks good to me. I have just one question: you tested some wrong-structure cases such as # Wrong structure: ./annotations -> ./labels and # Wrong structure: ./images -> ./imgs. Do we need to consider other kinds of wrong structure as well?

In those cases, this PR raises DatasetImportError to tell the user to follow the COCO directory structure rule. After this PR, only a directory structure that exactly matches the COCO directory structure rule will be accepted.
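
A minimal sketch of that behavior, under stated assumptions: the exception class below is a local stand-in with the same name as the error mentioned above, check_coco_layout is a hypothetical helper (not the importer's real method), and the rule is simplified to "the root must contain annotations/ and images/", which is what the two wrong-structure cases in the question rename.

```python
import os
import tempfile


class DatasetImportError(Exception):
    """Local stand-in for the error named in this PR, so the sketch runs alone."""


def check_coco_layout(root: str) -> None:
    """Raise DatasetImportError unless root contains the expected subdirectories.

    Assumption: only the presence of 'annotations/' and 'images/' is checked.
    """
    for name in ("annotations", "images"):
        if not os.path.isdir(os.path.join(root, name)):
            raise DatasetImportError(
                f"'{root}' is not a well-formed COCO directory: missing '{name}/'"
            )


def make_layout(root: str, *subdirs: str) -> str:
    """Create root with the given subdirectories and return root."""
    for name in subdirs:
        os.makedirs(os.path.join(root, name), exist_ok=True)
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Well-formed layout: no error.
        check_coco_layout(make_layout(os.path.join(tmp, "ok"), "annotations", "images"))
        # Wrong structure: ./annotations -> ./labels
        try:
            check_coco_layout(make_layout(os.path.join(tmp, "bad"), "labels", "images"))
        except DatasetImportError as err:
            print(err)
```

Because any layout that lacks the expected names is rejected, other renamed variants fail the same way without each needing its own test case.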

codecov-commenter commented Feb 17, 2023

Codecov Report

Base: 77.62% // Head: 77.62% // Decreases project coverage by -0.01% ⚠️

Coverage data is based on head (232b2f6) compared to base (f25f2d8).
Patch coverage: 89.47% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #812      +/-   ##
===========================================
- Coverage    77.62%   77.62%   -0.01%     
===========================================
  Files          184      184              
  Lines        23554    23563       +9     
  Branches      4879     4880       +1     
===========================================
+ Hits         18284    18290       +6     
- Misses        4198     4200       +2     
- Partials      1072     1073       +1     
Flag                       Coverage Δ
macos-11_Python-3.8        77.60% <89.47%> (-0.01%) ⬇️
ubuntu-20.04_Python-3.8    77.60% <89.47%> (-0.01%) ⬇️
windows-2019_Python-3.8    77.53% <89.47%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                                Coverage Δ
datumaro/plugins/data_formats/coco/base.py    91.30% <85.71%> (-0.67%) ⬇️
datumaro/components/errors.py                 87.17% <100.00%> (+0.06%) ⬆️
datumaro/components/media.py                  82.40% <100.00%> (+0.15%) ⬆️
datumaro/plugins/ndr.py                       75.00% <0.00%> (-1.09%) ⬇️
datumaro/plugins/splitter.py                  88.66% <0.00%> (+0.22%) ⬆️


@vinnamkim vinnamkim added this to the 1.0.0 milestone Feb 17, 2023
@vinnamkim vinnamkim added this pull request to the merge queue Feb 17, 2023
Merged via the queue into openvinotoolkit:develop with commit a15cf39 Feb 17, 2023
vinnamkim added a commit that referenced this pull request Feb 21, 2023
* Add daily/weekly test triggers (#811)

* Add daily/weekly checks in schedule

* Update for testing

* Fix version

* Fix version

* Fix version

* Update ttest configuration

* Rename daily to nightly

* Apply some feedbacks

* Apply some feedbacks

* Correct wrong file path.

* Raise ImportError on importing malformed COCO directory (#812)

* Raise ImportError on importing malformed COCO directory

 - Previously, COCOImporter implicitly set its rootpath and images_dir
when the directory structure was not well formed. Users therefore could not
tell that their dataset structure was malformed. As a result, after importing,
users tried to load the images, failed, and could not tell why.

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Update CHANGELOG.md

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Fix isort

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Fix unittest

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Fix testcase

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Fix testcase

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Image should have 2 or 3 dims

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Fix testcase

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Upload data explorer model in public storage (#813)

* Upload model in public storage

* Update model_dir

* Make dir

* Skip test if macos

* Skip unit tests if macos

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: Sooah Lee <sooah.lee@intel.com>
vinnamkim added a commit that referenced this pull request Feb 22, 2023
* Update version to release v1.0 early version to PyPI

* Dev to 1.0.0 pre-release (#817)


---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
Co-authored-by: chuneuny <emily.chun@intel.com>
Co-authored-by: Sooah Lee <sooah.lee@intel.com>
wonjuleee pushed a commit that referenced this pull request Mar 10, 2023
* Update version to release v1.0 early version to PyPI

* Dev to 1.0.0 pre-release (#817)


* [Release] Fix/update 3rd party txt (#820)

* Update 3rd-party.txt

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Convert tab to spaces

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>

* Remove the duplicated contexts

* Enable doc build on release branch

* Update changelog

* Apply code review feedbacks

* Remove a blank line

* Update version.py

* Update CHANGELOG.md

* Update CHANGELOG.md

* Add some description (#832)

---------

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
Co-authored-by: chuneuny <emily.chun@intel.com>
Co-authored-by: Sooah Lee <sooah.lee@intel.com>
Labels
data formats PR is related to dataset formats ENHANCE Enhancement of existing features
4 participants