Add intermediate skill docs for filter #996

Merged
6 commits merged on May 17, 2023

Conversation

sooahleex
Contributor

@sooahleex sooahleex commented May 11, 2023

Summary

  • Ticket no.107285
  • Update intermediate skill documentation for filter
    • Add python, CLI, ProjectCLI examples

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added the description of my changes into CHANGELOG.
  • I have updated the documentation accordingly

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

@sooahleex sooahleex changed the title from "Update intermediate skill docs for filter" to "Add intermediate skill docs for filter" May 11, 2023
@sooahleex sooahleex marked this pull request as ready for review May 11, 2023 06:01
@sooahleex sooahleex requested review from a team as code owners May 11, 2023 06:01
@codecov-commenter
codecov-commenter commented May 11, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (8fe4cf0) 78.53% compared to head (be397cc) 78.53%.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #996   +/-   ##
========================================
  Coverage    78.53%   78.53%           
========================================
  Files          233      233           
  Lines        26749    26749           
  Branches      5320     5320           
========================================
+ Hits         21007    21008    +1     
  Misses        4497     4497           
+ Partials      1245     1244    -1     
Flag Coverage Δ
macos-11_Python-3.8 77.53% <ø> (+<0.01%) ⬆️
ubuntu-20.04_Python-3.8 78.51% <ø> (+<0.01%) ⬆️
windows-2019_Python-3.8 78.42% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

see 1 file with indirect coverage changes

@sooahleex sooahleex added this to the 1.3.0 milestone May 12, 2023
@sooahleex sooahleex added the DOC Improvements or additions to documentation label May 12, 2023
datum filter -e <how/to/filter/dataset> --project <path/to/project>

We can set ``<how/to/filter/dataset>`` as your own filter like ``'/item/annotation[label="cat" and area > 85]'``.
This example commands will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.
Contributor

Suggested change
- This example commands will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.
+ This example command will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.

Contributor

@vinnamkim vinnamkim May 12, 2023

To my knowledge, '/item/annotation[...]' removes annotations, not items (it is '/item[annotation...]' that removes items), but your sentence reads as if the items themselves are removed.

- Filter images with large-area bboxes:|n
|s|s%(prog)s -e '/item[annotation/type="bbox" and
annotation/area>2000]'|n
|n
- Filter out all irrelevant annotations from items:|n
|s|s%(prog)s -m a -e '/item/annotation[label = "person"]'|n
|n
- Filter out all irrelevant annotations from items:|n
|s|s%(prog)s -m a -e '/item/annotation[label="cat" and
area > 99.5]'|n

Did you verify the actual behavior of Datumaro for this command? The scope of the skill-up page writing task also includes this verification.

Contributor Author

@sooahleex sooahleex May 15, 2023

I checked this command myself. Before running it, the dataset had 5000 items; afterwards, the resulting dataset had only 184 items. As far as I know, the filter works per item.

Contributor

I looked into it closely and found that --mode is the important argument here, because it determines whether items or annotations are filtered. Therefore, an explanation of --mode should be added to this section.

parser.add_argument(
    "-m",
    "--mode",
    default=FilterModes.i.name,
    type=FilterModes.parse,
    help="Filter mode (options: %s; default: %s)"
    % (", ".join(FilterModes.list_options()), "%(default)s"),
)

def make_filter_args(cls, mode):
    if mode == cls.items:
        return {}
    elif mode == cls.annotations:
        return {"filter_annotations": True}
    elif mode == cls.items_annotations:
        return {
            "filter_annotations": True,
            "remove_empty": True,
        }
    else:
        raise NotImplementedError()

if filter_annotations:
    return self.transform(XPathAnnotationsFilter, xpath=expr, remove_empty=remove_empty)
else:
    return self.transform(XPathDatasetFilter, xpath=expr)

if self._remove_empty and len(annotations) == 0:
    return None
return self.wrap_item(item, annotations=annotations)

What did you give for --mode? It seems that you gave -m i so that the items disappeared. Then, what is the difference between '/item/annotation[...]' and '/item[annotation...]'?
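
To make the difference between the modes concrete, here is a minimal standard-library sketch of the behavior the snippets above suggest. It does not use Datumaro or XPath: `Ann`, `Item`, `is_large_cat`, `filter_items`, and `filter_annotations` are hypothetical names invented for this illustration, and the item-mode rule (keep an item when at least one annotation matches) is an assumption about how the expression is evaluated, not a statement of the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ann:
    label: str
    area: float


@dataclass
class Item:
    id: str
    annotations: List[Ann] = field(default_factory=list)


def is_large_cat(ann: Ann) -> bool:
    # Hypothetical stand-in for the predicate [label="cat" and area > 85].
    return ann.label == "cat" and ann.area > 85


def filter_items(items: List[Item], pred: Callable[[Ann], bool]) -> List[Item]:
    # Item mode (-m i, the default): assumed to keep an item, together with
    # all of its annotations, when at least one annotation matches.
    return [it for it in items if any(pred(a) for a in it.annotations)]


def filter_annotations(
    items: List[Item], pred: Callable[[Ann], bool], remove_empty: bool = False
) -> List[Item]:
    # Annotation mode (-m a): keep only the matching annotations. Every item
    # survives unless remove_empty is set (the items_annotations case).
    result = []
    for it in items:
        kept = [a for a in it.annotations if pred(a)]
        if remove_empty and not kept:
            continue
        result.append(Item(it.id, kept))
    return result


dataset = [
    Item("1", [Ann("cat", 100.0), Ann("dog", 50.0)]),
    Item("2", [Ann("cat", 10.0)]),
    Item("3", []),
]

print(len(filter_items(dataset, is_large_cat)))                           # 1
print(len(filter_annotations(dataset, is_large_cat)))                     # 3
print(len(filter_annotations(dataset, is_large_cat, remove_empty=True)))  # 1
```

Under these assumptions, the same expression shrinks the dataset in item mode but leaves the item count unchanged in annotation mode, which is why the mode matters when reading the item counts before and after a filter.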

Contributor Author

When I ran datum filter source-1 -e '/item/annotation[label="cat" and area > 85]', dinfo for source-1 shows:

length: 184
categories: label
  label:
    count: 80
    labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)
subsets: val2017
  'val2017':
    length: 184
    categories: label
      label:
        count: 80
        labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)

And after running datum filter source-4 -m a -e '/item/annotation[label="cat" and area > 85]', dinfo for source-4 shows:

length: 5000
categories: label
  label:
    count: 80
    labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)
subsets: val2017
  'val2017':
    length: 5000
    categories: label
      label:
        count: 80
        labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)

I imported source-1 and source-4 from the same dataset, COCO val2017. If I use '/annotation[label="cat" and area > 85]' as the filter instead, it does not work.

As for explaining this option: it is already covered in filter.md, and I linked that page from the intermediate skill page for filter. The intermediate skill page only gives users a simple example; users who want more detail can go to filter.md and choose the mode and filter expression that fit their needs.

Contributor

@vinnamkim vinnamkim left a comment

LGTM.

@sooahleex sooahleex merged commit 332879d into openvinotoolkit:develop May 17, 2023
4 participants