Support list query for explorer #1087

sooahleex · 2023-07-10T08:23:34Z

Summary

Add support for list queries in the explorer, where each element of the list is a DatasetItem or a string.

How to test

I added some unit tests.

Checklist

I have added unit tests to cover my changes.
I have added integration tests to cover my changes.
I have added the description of my changes into CHANGELOG.
I have updated the documentation accordingly

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.
I have updated the license header for each file (see an example below).

# Copyright (C) 2023 Intel Corporation
#
# SPDX-License-Identifier: MIT

bonhunko

I left a minor comment. LGTM!

src/datumaro/components/algorithms/hash_key_inference/explorer.py

vinnamkim · 2023-07-11T02:01:30Z

src/datumaro/components/algorithms/hash_key_inference/explorer.py

+        if isinstance(query, list):
+            topk_for_query = int(topk // len(query)) * 2 if not len(query) == 1 else topk
+            query_hash_key_list = []
+            result_list = []
+            for query_ in query:
+                if isinstance(query_, DatasetItem):
+                    query_key = self._get_hash_key_from_item_query(query_)
+                    query_hash_key_list.append(query_key)
+                elif isinstance(query_, str):
+                    query_key = self._get_hash_key_from_text_query(query_)
+                    query_hash_key_list.append(query_key)
+                else:
+                    raise MediaTypeError(
+                        "Unexpected media type of query '%s'. "
+                        "Expected 'DatasetItem' or 'string', actual'%s'" % (query_, type(query_))
+                    )
+
+            for query_key in query_hash_key_list:
+                unpacked_key = np.unpackbits(query_key.hash_key, axis=-1)
+                logits = calculate_hamming(unpacked_key, database_keys)
+                ind = np.argsort(logits)
+
+                item_list = np.array(self._item_list)[ind]
+                result_list.extend(item_list[:topk_for_query].tolist())
+            return np.random.choice(result_list, topk)


Why is there a non-deterministic process here? Last time you said that the N-to-M top-k similarity search 1) finds the k most similar items for each of the N queries (Nk candidates) and 2) picks the top-k items from them. Is that right?

sooahleex · 2023-07-12

I updated the process for this. It first builds N*topk or N*topk_for_query candidates, together with their logits, for the N queries, and sorts the logits by value. It then reorders the candidates by the sorted logit indices and picks the top-k items from them.
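
For readers following the thread, the deterministic merge described in the reply above can be sketched roughly as follows. This is a minimal illustration, not the code that was merged: `hamming_distances` is a hypothetical stand-in for `calculate_hamming`, `explore_topk` is a hypothetical name, and the keys are assumed to be already unpacked into 0/1 arrays.

```python
import numpy as np


def hamming_distances(query_bits, database_bits):
    # Hypothetical stand-in for calculate_hamming:
    # number of differing bits between the query and each database row.
    return np.count_nonzero(database_bits != query_bits, axis=1)


def explore_topk(query_keys, database_keys, items, topk):
    """Collect per-query candidates, then keep the topk closest overall."""
    n_queries = len(query_keys)
    # Same per-query candidate budget as in the diff under review.
    topk_for_query = topk if n_queries == 1 else (topk // n_queries) * 2

    candidate_idx = []
    candidate_dist = []
    for key in query_keys:
        dist = hamming_distances(key, database_keys)
        # A stable sort keeps ties in a fixed, reproducible order.
        order = np.argsort(dist, kind="stable")[:topk_for_query]
        candidate_idx.append(order)
        candidate_dist.append(dist[order])

    candidate_idx = np.concatenate(candidate_idx)
    candidate_dist = np.concatenate(candidate_dist)

    # Re-sort all candidates by distance and keep the best topk.
    merged = np.argsort(candidate_dist, kind="stable")
    return [items[i] for i in candidate_idx[merged][:topk]]
```

Unlike `np.random.choice(result_list, topk)`, which samples with replacement by default and can return different items on each call, this returns the same result for the same inputs. The sketch does not remove duplicates, so an item that is close to several queries may appear more than once.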

vinnamkim

LGTM.

sooahleex added the ENHANCE Enhancement of existing features label Jul 10, 2023

sooahleex marked this pull request as ready for review July 11, 2023 01:41

sooahleex requested review from a team as code owners July 11, 2023 01:41

sooahleex requested review from bonhunko and removed request for a team July 11, 2023 01:41

bonhunko approved these changes Jul 11, 2023

vinnamkim reviewed Jul 11, 2023

sooahleex force-pushed the enhance/explore_list_query branch from de801fe to c038a5e Compare July 12, 2023 06:10

yunchu added this to the 1.4.0 milestone Jul 12, 2023

sooahleex added 2 commits July 12, 2023 23:09

Support list query for explorer

dfaf118

Update sort for explore result

c038a5e

vinnamkim approved these changes Jul 13, 2023

sooahleex merged commit a47d943 into openvinotoolkit:releases/1.4.0 Jul 13, 2023
4 checks passed
