Add intermediate skill docs for filter #996

sooahleex · 2023-05-11T05:53:38Z

Summary

Ticket no. 107285
Update intermediate skill documentation for filter
- Add Python, CLI, ProjectCLI examples

How to test

Checklist

I have added unit tests to cover my changes.
I have added integration tests to cover my changes.
I have added the description of my changes into CHANGELOG.
I have updated the documentation accordingly

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.
I have updated the license header for each file (see an example below).

# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

codecov-commenter · 2023-05-11T06:12:58Z

Codecov Report

Patch and project coverage have no change.

Comparison is base (8fe4cf0) 78.53% compared to head (be397cc) 78.53%.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop     #996   +/-   ##
========================================
  Coverage    78.53%   78.53%           
========================================
  Files          233      233           
  Lines        26749    26749           
  Branches      5320     5320           
========================================
+ Hits         21007    21008    +1     
  Misses        4497     4497           
+ Partials      1245     1244    -1

Flag	Coverage Δ
macos-11_Python-3.8	`77.53% <ø> (+<0.01%)`	⬆️
ubuntu-20.04_Python-3.8	`78.51% <ø> (+<0.01%)`	⬆️
windows-2019_Python-3.8	`78.42% <ø> (+<0.01%)`	⬆️

see 1 file with indirect coverage changes

docs/source/docs/level-up/intermediate_skills/09_data_filtering.rst

docs/source/docs/level-up/intermediate_skills/index.rst

docs/source/docs/level-up/intermediate_skills/09_data_filtering.rst

vinnamkim · 2023-05-12T08:03:44Z

+            datum filter -e <how/to/filter/dataset> --project <path/to/project>
+
+        We can set ``<how/to/filter/dataset>`` as your own filter like ``'/item/annotation[label="cat" and area > 85]'``.
+        This example commands will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.


Suggested change

-        This example commands will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.
+        This example command will filter only items through the bbox annotations which have `cat` label and bbox area (`w * h`) more than 85.

To my knowledge, '/item/annotation[...]' removes annotations, not items ('/item[annotation...]' is what removes items), but your sentence reads as if the items themselves are removed.

datumaro/datumaro/cli/commands/filter.py

Lines 73 to 82 in 8fe4cf0

- Filter images with large-area bboxes:|n
|s|s%(prog)s -e '/item[annotation/type="bbox" and
annotation/area>2000]'|n
|n
- Filter out all irrelevant annotations from items:|n
|s|s%(prog)s -m a -e '/item/annotation[label = "person"]'|n
|n
- Filter out all irrelevant annotations from items:|n
|s|s%(prog)s -m a -e '/item/annotation[label="cat" and
area > 99.5]'|n

Did you verify Datumaro's actual behavior for this command? Verifying it is part of the scope of the skill-up page task.

I checked this command myself. Before running it, the dataset had 5000 items; afterwards, the result had only 184 items. As far as I know, the filter works per item.

I looked into this closely and found that --mode is the key argument here, because it determines whether items or annotations are filtered. An explanation of --mode should therefore be added to this section.

datumaro/datumaro/cli/commands/filter.py

Lines 97 to 104 in 8fe4cf0

parser.add_argument(
    "-m",
    "--mode",
    default=FilterModes.i.name,
    type=FilterModes.parse,
    help="Filter mode (options: %s; default: %s)"
    % (", ".join(FilterModes.list_options()), "%(default)s"),
)

datumaro/datumaro/cli/util/project.py

Lines 193 to 204 in 8fe4cf0

def make_filter_args(cls, mode):
    if mode == cls.items:
        return {}
    elif mode == cls.annotations:
        return {"filter_annotations": True}
    elif mode == cls.items_annotations:
        return {
            "filter_annotations": True,
            "remove_empty": True,
        }
    else:
        raise NotImplementedError()

datumaro/datumaro/components/dataset.py

Lines 897 to 900 in 8fe4cf0

if filter_annotations:
    return self.transform(XPathAnnotationsFilter, xpath=expr, remove_empty=remove_empty)
else:
    return self.transform(XPathDatasetFilter, xpath=expr)

datumaro/datumaro/components/filter.py

Lines 289 to 291 in 8fe4cf0

if self._remove_empty and len(annotations) == 0:
    return None
return self.wrap_item(item, annotations=annotations)
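
Taken together, the snippets above suggest how the three modes differ. The following is only a dependency-free toy model, not Datumaro code: `Ann`, `Item`, `mode_to_kwargs`, `toy_filter` and `is_big_cat` are hypothetical names, a plain Python predicate stands in for the XPath expression, and the item-mode rule (keep an item unchanged when at least one annotation matches) is an assumption that happens to be consistent with the 184-vs-5000 item counts reported in this thread.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Ann:
    label: str
    area: float


@dataclass
class Item:
    id: str
    annotations: List[Ann] = field(default_factory=list)


def mode_to_kwargs(mode: str) -> Dict[str, bool]:
    """Same mapping as the quoted make_filter_args, with plain strings as modes."""
    if mode == "i":
        return {}
    if mode == "a":
        return {"filter_annotations": True}
    if mode == "i+a":
        return {"filter_annotations": True, "remove_empty": True}
    raise NotImplementedError(mode)


def toy_filter(
    items: List[Item],
    pred: Callable[[Ann], bool],
    filter_annotations: bool = False,
    remove_empty: bool = False,
) -> List[Item]:
    out = []
    for item in items:
        matched = [a for a in item.annotations if pred(a)]
        if not filter_annotations:
            # Item mode (assumed): keep the whole item, untouched, if anything matched.
            if matched:
                out.append(item)
        elif matched or not remove_empty:
            # Annotation modes: keep only matching annotations;
            # items left empty are dropped only when remove_empty is set.
            out.append(Item(item.id, matched))
    return out


def is_big_cat(a: Ann) -> bool:
    # Stand-in for the expression label="cat" and area > 85.
    return a.label == "cat" and a.area > 85


dataset = [
    Item("img1", [Ann("cat", 120.0), Ann("dog", 300.0)]),
    Item("img2", [Ann("dog", 90.0)]),
    Item("img3", [Ann("cat", 10.0)]),
]

for mode in ("i", "a", "i+a"):
    result = toy_filter(dataset, is_big_cat, **mode_to_kwargs(mode))
    print(mode, [(it.id, [a.label for a in it.annotations]) for it in result])
```

In this toy dataset, mode `i` keeps only `img1` with both of its annotations, mode `a` keeps all three items but strips every non-matching annotation, and mode `i+a` keeps only `img1` with its single matching annotation.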

What did you pass for --mode? It seems you used -m i, which is why the items disappeared. In that case, what is the difference between '/item/annotation[...]' and '/item[annotation...]'?

When I ran datum filter source-1 -e '/item/annotation[label="cat" and area > 85]' (no --mode given, so the default applies), dinfo for source-1 reports:

length: 184
categories: label
  label:
    count: 80
    labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)
subsets: val2017
  'val2017':
    length: 184
    categories: label
      label:
        count: 80
        labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)

For the command datum filter source-4 -m a -e '/item/annotation[label="cat" and area > 85]', dinfo for source-4 reports:

length: 5000
categories: label
  label:
    count: 80
    labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)
subsets: val2017
  'val2017':
    length: 5000
    categories: label
      label:
        count: 80
        labels: person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light (and 70 more)

I imported source-1 and source-4 from the same dataset, COCO val2017. When I used '/annotation[label="cat" and area > 85]' as the filter, it did not work.

The explanation of this command is already covered in filter.md, which I linked from the intermediate skill page for filter. The intermediate skill page only gives users a simple example; users who want more detail can go to filter.md and choose the mode and filter expression that suit them.

vinnamkim

LGTM.

sooahleex changed the title ~~Update intermediate skill docs for filter~~ Add intermediate skill docs for filter May 11, 2023

sooahleex force-pushed the doc/filter_intermediate branch from 79170b1 to b44bea6 Compare May 11, 2023 05:59

sooahleex marked this pull request as ready for review May 11, 2023 06:01

sooahleex requested review from a team as code owners May 11, 2023 06:01

sooahleex requested review from vinnamkim, chuneuny-emily, sstrehlk and wonjuleee May 11, 2023 06:01

sooahleex added 3 commits May 11, 2023 23:00

Update intermediate skill docs for filter

1104c4d

Update CHANGELOG

fcdd553

Fix Linter

b44bea6

vinnamkim reviewed May 12, 2023

docs/source/docs/level-up/intermediate_skills/09_data_filtering.rst

docs/source/docs/level-up/intermediate_skills/index.rst

docs/source/docs/level-up/intermediate_skills/09_data_filtering.rst (outdated)

sooahleex added this to the 1.3.0 milestone May 12, 2023

sooahleex added the DOC Improvements or additions to documentation label May 12, 2023

vinnamkim reviewed May 12, 2023

docs/source/docs/level-up/intermediate_skills/09_data_filtering.rst (outdated)

vinnamkim reviewed May 12, 2023

sooahleex added 3 commits May 12, 2023 18:59

Add example for how to set filter and match number of docs

d827dda

Update example command explanation

ff3babf

Update explanation

be397cc

vinnamkim approved these changes May 16, 2023

cih9088 approved these changes May 16, 2023

sooahleex merged commit 332879d into openvinotoolkit:develop May 17, 2023
