Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add patch command #401

Merged
merged 34 commits into from
Sep 29, 2021
Merged

Add patch command #401

merged 34 commits into from
Sep 29, 2021

Conversation

zhiltsov-max
Copy link
Contributor

@zhiltsov-max zhiltsov-max commented Aug 5, 2021

Summary

  • Added datum patch command, which allows to update a dataset from another dataset
  • Added command docs
  • Added ProjectLabels transform to align dataset labels
  • Added label existence check in LabelCategories by name
  • Added an error message on attempt to re-download a source with no hash
  • Improved COCO export docs
  • Fixed return value type in Dataset.is_modified
  • Fixed missing error on saving project readonly project datasets inplace by path
  • Fixed class inheritance for DatasetMergeError (item_id should not be mandatory)
  • Fixed sensitive args in datum diff
  • Fixed label remapping for secondary categories in RemapLabels

How to test

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT

@zhiltsov-max zhiltsov-max marked this pull request as draft August 5, 2021 14:32
@zhiltsov-max zhiltsov-max changed the base branch from zm/vcs-simple-exp to develop September 8, 2021 14:30
@zhiltsov-max zhiltsov-max changed the title [Dependent] Add patch command Add patch command Sep 8, 2021
@zhiltsov-max zhiltsov-max marked this pull request as ready for review September 23, 2021 14:58
@zhiltsov-max zhiltsov-max changed the title Add patch command [WIP] Add patch command Sep 23, 2021
@zhiltsov-max zhiltsov-max changed the title [WIP] Add patch command Add patch command Sep 27, 2021
CHANGELOG.md Outdated Show resolved Hide resolved
datumaro/cli/commands/patch.py Outdated Show resolved Hide resolved
datumaro/cli/commands/patch.py Outdated Show resolved Hide resolved
datumaro/cli/commands/patch.py Outdated Show resolved Hide resolved
datumaro/cli/commands/patch.py Outdated Show resolved Hide resolved
dst_mask_cat = MaskCategories(attributes=src_mask_cat.attributes)
dst_mask_cat.colormap = {
id: src_mask_cat[id]
for id, _ in enumerate(src_label_cat.items)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for id, _ in enumerate(src_label_cat.items)
for id in range(len(src_label_cat.items))

if id in src_point_cat and (self._map_id(id) or id == 0)
}
self._categories[AnnotationType.points] = dst_point_cat

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, there should be a check here that no categories of unknown types are present. Otherwise, if a new category type is added in the future and this class is not updated, then old label IDs could leak into the transformed dataset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be a check

Probably, a good point.

could leak

Input categories are not copied in these transforms, so no.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input categories are not copied in these transforms, so no.

Okay, fair point.

However, the annotations are copied. So perhaps we should make sure to only copy annotations of known types so that we can make sure that annotations aren't copied without the appropriate categories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, addition of new annotation type is a rare event, and addition of a new categories is even more rare. I suppose, when it happen, it will require a number of similar changes across the code.

datumaro/plugins/transforms.py Outdated Show resolved Hide resolved
datumaro/plugins/transforms.py Outdated Show resolved Hide resolved
datumaro/util/annotation_util.py Outdated Show resolved Hide resolved
@zhiltsov-max zhiltsov-max merged commit d820179 into develop Sep 29, 2021
@zhiltsov-max zhiltsov-max deleted the zm/patch-command branch September 29, 2021 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants