New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Add ConfigurableValidator #1142

Merged

wonjuleee merged 10 commits into openvinotoolkit:develop from wonjuleee:conf_val

Sep 8, 2023

Contributor

wonjuleee commented Sep 6, 2023

Summary

How to test

Checklist

I have added unit tests to cover my changes.
I have added integration tests to cover my changes.
I have added the description of my changes into CHANGELOG.
I have updated the documentation accordingly

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.
I have updated the license header for each file (see an example below).

# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

wonjuleee requested review from a team as code owners

September 6, 2023 05:50

wonjuleee requested review from vinnamkim and removed request for a team

September 6, 2023 05:50

codecov bot commented Sep 6, 2023 •

edited

Loading

Codecov Report

Patch coverage: 84.12% and project coverage change: +0.07% 🎉

Comparison is base (ec00525) 79.86% compared to head (3aa11ab) 79.93%.
Report is 1 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1142      +/-   ##
===========================================
+ Coverage    79.86%   79.93%   +0.07%     
===========================================
  Files          264      265       +1     
  Lines        29391    29895     +504     
  Branches      5753     5889     +136     
===========================================
+ Hits         23473    23898     +425     
- Misses        4583     4640      +57     
- Partials      1335     1357      +22

Flag	Coverage Δ
macos-11_Python-3.8	`79.05% <84.12%> (+0.09%)`	⬆️
ubuntu-20.04_Python-3.8	`79.92% <84.12%> (?)`
windows-2019_Python-3.8	`79.86% <84.12%> (+0.07%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed	Coverage Δ
src/datumaro/plugins/configurable_validator.py	`84.12% <84.12%> (ø)`

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.

📢 Have feedback on the report? Share it here.

vinnamkim reviewed

View reviewed changes

src/datumaro/plugins/configurable_validator.py Outdated

+                      pass
+                  def _update_ann_type_stats(self, item_key: tuple, annotation: Annotation):
+                      if self.BBOX_WARNINGS & self.warnings:

Contributor

vinnamkim Sep 6, 2023

I hope that we build a collection of update functions when initializing this class instance according to the input configuration. This is for doing not to check its configuration every time during the dataset item loop. We can iterate on the collection of update functions.

Contributor Author

wonjuleee Sep 7, 2023

I have tried to register function for warning when initialization at cc10ff4. Could you check this?

vinnamkim reviewed

View reviewed changes

src/datumaro/plugins/configurable_validator.py Outdated

Comment on lines 151 to 152

		self.stats["invalid_value"] = set()
		self.stats["negative_length"] = set()

Contributor

vinnamkim Sep 6, 2023

Could we have a typed schema, e.g. @dataclass, for the output data store? It currently uses a dictionary with string key and it is hard to see what is going on this class.

Contributor Author

wonjuleee Sep 7, 2023

I introduced StatsData at cc10ff4.

vinnamkim reviewed

View reviewed changes

src/datumaro/plugins/configurable_validator.py Outdated Show resolved Hide resolved

vinnamkim reviewed

View reviewed changes

src/datumaro/plugins/configurable_validator.py Outdated

Comment on lines 385 to 461

+                          if task == TaskType.classification:
+                              if {MultiLabelAnnotations} & self.warnings:
+                                  reports[task] += self._check_multiple_label(stats)

Contributor

vinnamkim Sep 6, 2023 •

edited

Loading

Although we produce the reports as dict, is it possible to update each part of the dictionary in each collector's member function? I hope that we do not gather all the things getting together in this function.

wonjuleee force-pushed the conf_val branch 3 times, most recently from cc10ff4 to d79f11c Compare

September 7, 2023 05:13

wonjuleee added 8 commits

September 7, 2023 21:39


          init

70cae2c


          add configurable validator

6e0f1e3


          add test

d384a00


          update conf_val

58ca106


          update test

d79f11c


          update func

74c70be


          remove print


          update specs.json

b6f7ba6

vinnamkim reviewed

View reviewed changes

src/datumaro/plugins/configurable_validator.py Outdated

+                          return None
+                  def compute_statistics(self, dataset):
+                      if not self.init_flag:

Contributor

vinnamkim Sep 8, 2023

Do we need this flag? I guess that this class has configurations when it is created. With following these configuration and taking the dataset in this function argument, it produce statistics according to those two inputs. Therefore, we should initialize statistics carriers (self.all_stats) every time we take dataset, right?

Contributor Author

wonjuleee Sep 8, 2023

you meant that we have to compute statistics from scratch when dataset is passed everytime, is this correct?

Contributor

vinnamkim Sep 8, 2023

Yes. However, if you imagine the functionality to update incrementally all_stats by calling compute_statistics() again and again for the sequence of dataset, I think it is better to introduce all_stats as the function argument (not member variables) for the more understandable data flow in this function.

Contributor Author

wonjuleee Sep 8, 2023

I have removed init_flag following your opinion.

src/datumaro/plugins/configurable_validator.py Outdated Show resolved Hide resolved

src/datumaro/plugins/configurable_validator.py Outdated

Comment on lines 949 to 950

		if not stats_collector:
		continue

Contributor

vinnamkim Sep 8, 2023

Do we need this if statement? I think that L892 and self._init_stats_collector() make there is no None in self.all_stats without any exception. Similar to my above comment, wouldn't it be better to let self._init_stats_collector() return all_stats rather than storing it as the member variable? The all_stats are dependent to dataset, so that maintaining it as a member variable seems not good for understanding the data flow in a function.

vinnamkim approved these changes

View reviewed changes

Contributor

vinnamkim left a comment

LGTM.

wonjuleee merged commit 0cdf176 into openvinotoolkit:develop

6 checks passed

wonjuleee added 2 commits

September 8, 2023 18:39


          update changelog

dd2ebdb


          add type per each label_name

3aa11ab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet