Develop DatumaroBinaryFormat to export/import the dataset header & DatasetItem #828

@vinnamkim vinnamkim commented Feb 27, 2023

Summary

  • Ticket no. 104650
  • Develop DatumaroBinaryFormat to export/import the dataset header & DatasetItem (annotation support is not yet complete; a subsequent PR will cover the rest)

How to test

The added tests in this PR cover the changes.

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
 - Refactor CommonSemanticSegmentation unit tests as well

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
 - Support DatasetItem in the Datumaro binary format; annotations are not yet supported.

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
@vinnamkim vinnamkim added this to the 1.1.0 milestone Feb 27, 2023
@vinnamkim vinnamkim added the ENHANCE (Enhancement of existing features) and data formats (PR is related to dataset formats) labels Feb 27, 2023
@vinnamkim vinnamkim changed the title from "Feature/add datumaro binary format dataset item" to "Develop DatumaroBinaryFormat to export/import the dataset header & DatasetItem" Feb 27, 2023
@wonjuleee wonjuleee left a comment

There are some duplicates from previous PRs. Could you separate them first?

datumaro/components/format_detection.py (outdated, resolved)
datumaro/components/importer.py (outdated, resolved)
datumaro/components/media.py (resolved)
datumaro/plugins/data_formats/datumaro/base.py (outdated, resolved)
…maro-binary-format-dataset-item

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
@vinnamkim vinnamkim marked this pull request as ready for review February 28, 2023 05:31
@vinnamkim

There are some duplicates from previous PRs. Could you separate them first?

It's ready for review now.

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
codecov-commenter commented Feb 28, 2023

Codecov Report

Base: 78.33% // Head: 78.47% // Increases project coverage by +0.14% 🎉

Coverage data is based on head (7fcd938) compared to base (79b9eb1).
Patch coverage: 92.46% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #828      +/-   ##
===========================================
+ Coverage    78.33%   78.47%   +0.14%     
===========================================
  Files          189      191       +2     
  Lines        23681    23911     +230     
  Branches      4895     4912      +17     
===========================================
+ Hits         18550    18764     +214     
- Misses        4032     4044      +12     
- Partials      1099     1103       +4     
| Flag | Coverage Δ | |
|---|---|---|
| macos-11_Python-3.8 | 77.79% <92.46%> (+0.15%) | ⬆️ |
| ubuntu-20.04_Python-3.8 | 78.45% <92.46%> (+0.14%) | ⬆️ |
| windows-2019_Python-3.8 | 78.39% <92.46%> (+0.14%) | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| datumaro/util/image.py | 90.39% <ø> (ø) | |
| ...aro/plugins/data_formats/datumaro_binary/format.py | 88.23% <75.00%> (-11.77%) | ⬇️ |
| ...aro/plugins/data_formats/datumaro_binary/mapper.py | 86.41% <86.41%> (ø) | |
| ...ro/plugins/data_formats/datumaro_binary/crypter.py | 92.30% <92.30%> (ø) | |
| ...umaro/plugins/data_formats/datumaro_binary/base.py | 95.34% <95.12%> (-4.66%) | ⬇️ |
| ...o/plugins/data_formats/datumaro_binary/exporter.py | 96.07% <95.34%> (-3.93%) | ⬇️ |
| datumaro/components/format_detection.py | 92.98% <100.00%> (+0.34%) | ⬆️ |
| datumaro/components/media.py | 83.42% <100.00%> (+1.02%) | ⬆️ |
| ...ugins/data_formats/common_semantic_segmentation.py | 82.02% <100.00%> (+0.62%) | ⬆️ |
| datumaro/plugins/data_formats/datumaro/base.py | 93.07% <100.00%> (+0.10%) | ⬆️ |

... and 5 more

vinnamkim commented Feb 28, 2023

@chuneuny-emily, @yunchu

I added 454b22c because of the recent CI failure: https://github.com/openvinotoolkit/datumaro/actions/runs/4289888756/jobs/7473354523.

The failure occurs because our test_utils.py depends on tfds.testing.mock_data, but our testing environment lacks the dependencies that tfds.testing needs.

with tfds.testing.mock_data(num_examples=NUM_EXAMPLES, as_dataset_fn=as_dataset):

Going forward, I think it would be good to move test_utils.py from ./datumaro to ./tests and to declare the testing (dev) environment requirements explicitly under ./tests, e.g. ./tests/requirements.txt.
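
As a rough illustration of the dependency problem described above (not the change made in 454b22c), a test helper could check whether an optional module can be located before relying on it. The helper names `has_module` and `tfds_testing_available` are hypothetical, and the dotted path `tensorflow_datasets.testing` is an assumption about what `tfds` refers to here.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the module `name` can be located in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError, which is a
        # subclass of ImportError; treat that as "not available".
        return False


def tfds_testing_available() -> bool:
    """Hypothetical helper: check for the optional tfds testing module.

    Assumption: `tfds` is the usual alias for `tensorflow_datasets`.
    """
    return has_module("tensorflow_datasets.testing")
```

A flag like this could feed a `pytest.mark.skipif` condition so that tests relying on the mock data are skipped, rather than failing at import time, when the dev requirements are not installed.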

Signed-off-by: Kim, Vinnam <vinnam.kim@intel.com>
@vinnamkim vinnamkim merged commit ca79fbd into openvinotoolkit:develop Mar 1, 2023