-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mark datumaro library version when exporting as datumaro #842
mark datumaro library version when exporting as datumaro #842
Conversation
a252a3c
to
fd84f0b
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #842 +/- ##
===========================================
+ Coverage 78.47% 78.53% +0.05%
===========================================
Files 196 196
Lines 24179 24216 +37
Branches 4906 4908 +2
===========================================
+ Hits 18974 19017 +43
+ Misses 4114 4110 -4
+ Partials 1091 1089 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 3 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
12626c6
to
33912c2
Compare
2ce0e3d
to
1e39725
Compare
cbdeffd
to
83e2d36
Compare
I re-organized code as @vinnamkim suggested, please review this PR. |
7ca5580
to
46787c3
Compare
@mark_requirement(Requirements.DATUM_GENERAL_REQ) | ||
def test_export_dm_version(self, test_dir): | ||
dataset = Dataset.from_iterable( | ||
[DatasetItem(id=1, media=Image(data=np.ones((5, 4, 3))))], media_type=Image | ||
) | ||
|
||
legacy_dataset_path = "./tests/assets/datumaro_dataset" | ||
for annot_file in glob.glob( | ||
osp.join( | ||
legacy_dataset_path, | ||
DatumaroPath.ANNOTATIONS_DIR, | ||
"**", | ||
"*" + DatumaroPath.ANNOTATION_EXT, | ||
), | ||
recursive=True, | ||
): | ||
dm_base = DatumaroBase(path=annot_file) | ||
assert ( | ||
dm_base._get_dm_format_version(dm_base._parsed_anns) == DatumaroBase.LEGACY_VERSION | ||
) | ||
|
||
dataset.export(save_dir=test_dir, format="datumaro") | ||
for annot_file in glob.glob( | ||
osp.join( | ||
test_dir, DatumaroPath.ANNOTATIONS_DIR, "**", "*" + DatumaroPath.ANNOTATION_EXT | ||
), | ||
recursive=True, | ||
): | ||
dm_base = DatumaroBase(path=annot_file) | ||
assert dm_base._get_dm_format_version(dm_base._parsed_anns) == DATUMARO_FORMAT_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this TC is already covered by the existing unit tests so that we can remove this. if you want to make a new TC, I recommend testing the broken version compatibility. For example,
@pytest.fixture
def fxt_wrong_version_dir(fxt_test_datumaro_format_dataset, test_dir):
dest_dir = osp.join(test_dir, "wrong_version")
fxt_test_datumaro_format_dataset.export(dest_dir, "datumaro")
< Somehow edit the annotation file to give some weird version to it >
yield dest_dir
def test_version_compatibilty(self, fxt_wrong_version_dir):
with pytest.raises(DatasetImportError):
Dataset.import_from(fxt_wrong_version_dir, "datumaro")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is checking that
exported dm_dataset files have current datumaro format version or not.
As far as I know, existing unit tests doesn't check this.
if I missed some existing tests, please let me know.
BTW, testing the broken version compatibility is a good idea. I'll add the unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the version compatibility is met, the dataset will be normally parsed without any error, and this is covered by the existing test. I think that code quality is important not only for the source but also for the test code (https://www.springboottutorial.com/code-quality-what-is-code-duplication).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If somehow datumaro version string is not stored in the exported dataset,
when exporting back, this dataset will be recognized as 'legacy' which is not appreciable but will be parsed well anyway.
Thus I think that we should check whether the exported datumaro has the current datumaro format version well or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, we need to satisfy the following two things after this PR.
- Succeed to import the dataset without version info (legacy, what has been used)
- Succeed to import the dataset with version = 1.0 (to-be)
- Fail to import the dataset with version > 1.0
2)
will be tested by the existing export and import TCs.
I thought that there has existed a TC for 1) but it doesn't. Sorry for my mistake.
So, it would be better to move tests/assets/datumaro_dataset -> tests/assets/datumaro_dataset/legacy
, create a TC to import from tests/assets/datumaro_dataset/legacy
and conduct compare_datasets()
. For example,
datumaro/tests/unit/data_formats/test_common_semantic_segmentation_format.py
Lines 135 to 144 in a44fc79
def test_can_import( | |
self, | |
fxt_dataset_dir: str, | |
fxt_expected_dataset: Dataset, | |
fxt_import_kwargs: Dict[str, Any], | |
request: pytest.FixtureRequest, | |
): | |
return super().test_can_import( | |
fxt_dataset_dir, fxt_expected_dataset, fxt_import_kwargs, request | |
) |
It is not appealing to call dm_base._get_dm_format_version()
because it's a private function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added test_version_compatibilty()
and 1)
tests as you suggested. Please have a look
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
c90adcf
to
98ddc50
Compare
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
3d04f45
to
832e7fe
Compare
@vinnamkim after rebase to the lastest develop, I added |
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
890b98f
to
31e9d8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Co-authored-by: Vinnam Kim <vinnam.kim@gmail.com>
Summary
Resolves https://jira.devtools.intel.com/browse/CVS-105523
This PR adds the datumaro format version into an exported dataset.
How to test
Checklist
develop
branchLicense
Feel free to contact the maintainers if that's a concern.