Optimize in clause evaluation #11557

Jackie-Jiang · 2023-09-11T04:24:42Z

#10254 introduced the sort then merge sort algorithm to handle large in clause. It has time complexity of O(m + n) with m in clause values and n dictionary size. If we do not count the fixed sorting cost (amortized because it is done only once), comparing to binary search with complexity of O(mlogn), it can perform much worse when n >> m.

This PR optimizes the algorithm to do divide and concur: we binary search the middle value of the in clause, then divide the search into 2 parts, each with half of the in values and dictionary range. Overall the time complexity should be within O(m) (when m is large) to O(mlogn) (when m is small), but always better than binary search, and at least on par with the merge sort.

Removed the old query option: inPredicateSortThreshold
Added 2 query options:

inPredicatePreSorted: indicate that the given in clause is already sorted
inPredicateLookupAlgorithm: algorithm to use to solve in clause (default is DIVIDE_BINARY_SEARCH)
- DIVIDE_BINARY_SEARCH: algorithm introduced in this PR
- SCAN: old merge sort algorithm which scans all values
- PLAIN_BINARY_SEARCH: do not sort the in clause values and use vanilla binary search

kishoreg · 2023-09-11T04:36:34Z

Thanks Jackie. lets add some test case.

Jackie-Jiang · 2023-09-11T04:45:08Z

@kishoreg All the existing tests will use the new algorithm (it is on by default because it always outperform binary search)

kishoreg · 2023-09-11T05:10:22Z

can we comment on the performance of this depending on the number of hits

for example sorted values length (m) = 1000
dictionary length (n) = 1M

What is the performance for each of the following strategy when we vary the hits from 0% match to 100% match.

linear search (mlogn)
merge sort o(m+n)
this PR's divide and conquer o(m) to o(mlogn)

kishoreg · 2023-09-11T05:11:09Z

@kishoreg All the existing tests will use the new algorithm (it is on by default because it always outperform binary search)

Thats amazing. may be some numbers will help for future references.

codecov-commenter · 2023-09-11T05:18:46Z

Codecov Report

Merging #11557 (8ebcc80) into master (a8b5fe7) will decrease coverage by 0.02%.
The diff coverage is 56.52%.

@@             Coverage Diff              @@
##             master   #11557      +/-   ##
============================================
- Coverage     63.06%   63.04%   -0.02%     
+ Complexity     1107     1106       -1     
============================================
  Files          2326     2326              
  Lines        124928   124974      +46     
  Branches      19073    19078       +5     
============================================
+ Hits          78787    78793       +6     
- Misses        40542    40572      +30     
- Partials       5599     5609      +10

Flag	Coverage Δ
integration	`<0.01% <0.00%> (ø)`
integration1	`<0.01% <0.00%> (ø)`
integration2	`0.00% <0.00%> (ø)`
java-11	`63.02% <56.52%> (-0.02%)`	⬇️
java-17	`62.90% <56.52%> (+<0.01%)`	⬆️
java-20	`62.90% <56.52%> (+<0.01%)`	⬆️
temurin	`63.04% <56.52%> (-0.02%)`	⬇️
unittests	`63.04% <56.52%> (-0.02%)`	⬇️
unittests1	`67.44% <56.52%> (-0.04%)`	⬇️
unittests2	`14.50% <0.00%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed	Coverage Δ
...va/org/apache/pinot/spi/utils/CommonConstants.java	`21.11% <ø> (ø)`
...core/operator/filter/predicate/PredicateUtils.java	`41.46% <36.84%> (-0.43%)`	⬇️
...che/pinot/segment/spi/index/reader/Dictionary.java	`63.41% <50.00%> (-0.48%)`	⬇️
...segment/index/readers/BaseImmutableDictionary.java	`83.42% <65.90%> (+0.99%)`	⬆️

... and 9 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

xiangfu0 · 2023-09-11T14:43:09Z

pinot-core/src/main/java/org/apache/pinot/core/operator/filter/predicate/PredicateUtils.java

-        if (queryContext == null || values.size() <= Integer.parseInt(queryContext.getQueryOptions()
-            .getOrDefault(CommonConstants.Broker.Request.QueryOptionKey.IN_PREDICATE_SORT_THRESHOLD,
-                CommonConstants.Broker.Request.QueryOptionValue.DEFAULT_IN_PREDICATE_SORT_THRESHOLD))) {
+        if (queryContext == null || Boolean.parseBoolean(


If else block is always better, then we can remove this?

Just want to give it an option to turn it off in case sorting is dominating the latency (very unlikely though, but who knows)

Jackie-Jiang · 2023-09-11T17:16:00Z

can we comment on the performance of this depending on the number of hits

for example sorted values length (m) = 1000 dictionary length (n) = 1M

What is the performance for each of the following strategy when we vary the hits from 0% match to 100% match.

linear search (mlogn)

merge sort o(m+n)

this PR's divide and conquer o(m) to o(mlogn)

In this specific scenario:

Regular binary search: O(1000log1M)
Merge sort: ~= O(1M)
Divide and conquer: O(log1M + 2 * log500k + 4 * log250k + ... + 512 * log2k) ~= O(1000log4k)

ankitsultana · 2023-09-11T19:23:23Z

@Jackie-Jiang : are there any latency numbers for comparison? (this PR's approach vs no optimization vs sort then merge-sort)

jasperjiaguo · 2023-09-13T19:13:41Z

can we comment on the performance of this depending on the number of hits
for example sorted values length (m) = 1000 dictionary length (n) = 1M
What is the performance for each of the following strategy when we vary the hits from 0% match to 100% match.

linear search (mlogn)

merge sort o(m+n)

this PR's divide and conquer o(m) to o(mlogn)

In this specific scenario:

Regular binary search: O(1000log1M)

Merge sort: ~= O(1M)

Divide and conquer: O(log1M + 2 * log500k + 4 * log250k + ... + 512 * log2k) ~= O(1000log4k)

@Jackie-Jiang So the complexity of divide and conquer is O(log(n) + 2 log(n/2) + ... + m/2 log(n/(m/2))) ~ O(m logn - m logm), so my guess is when m is at the order of sqrt(n), the user would see some benefits if they have pre-sorted entries in the in clause? ~~If the entries are not pre-sorted then sorting it beforehand (takes O(m logm)) is probably not necessary?~~ @vvivekiyer brought up a good point yes it should be helpful if it's one sort per server.

It would be ideal if we can have some JMH benchmarks before making this the default behavior; as the benefit of the optimization might be offsetted by it's complexity when m is small?

Jackie-Jiang · 2023-09-14T06:43:11Z

Added a benchmark and here are the results: (value is OPS, higher the better)

Cardinality	Percentage	DivideBS	PlainBS	Scan	Divide vs Plain	Divide vs Scan
100	1	7659947.834	7193881.873	2590606.363	1.064786435	2.956816575
100	2	4527399.93	3702032.137	2297514.754	1.222949927	1.970564029
100	4	2298448.048	2219638.52	884871.898	1.035505569	2.59749242
100	8	1624385.705	1188408.158	730888.462	1.366858427	2.22248098
100	16	789605.698	561158.422	466697.802	1.407099434	1.691899329
100	32	451038.526	270074.742	414345.056	1.670050752	1.08855776
100	64	249953.379	145491.186	306353.403	1.717996711	0.8158988167
100	100	193388.569	81558.625	252231.273	2.371160242	0.766711307
1000	1	673035.271	604210.035	53069.877	1.113909455	12.68205824
1000	2	331272.455	296964.057	56142.286	1.115530473	5.900587215
1000	4	192433.934	154232.986	53796.693	1.247683385	3.577058798
1000	8	120568.939	67105.009	46336.664	1.796720406	2.602020271
1000	16	71785.317	33548.834	41131.107	2.139726138	1.745280452
1000	32	42476.434	13445.678	33515.651	3.159114327	1.26736115
1000	64	22895.909	7281.486	26235.134	3.144400607	0.8727193465
1000	100	16424.202	4369.1	25628.041	3.759172827	0.6408684144
10000	1	56327.402	36695.237	5402.282	1.53500581	10.42659417
10000	2	34808.144	19821.462	5119.112	1.756083583	6.799644938
10000	4	17176.683	8525.903	4878.576	2.014646777	3.520839483
10000	8	7889.65	4484.064	4283.167	1.759486484	1.842013165
10000	16	4291.993	2175.905	3454.031	1.97250937	1.242604076
10000	32	2414.918	1111.743	2897.232	2.172190875	0.833525931
10000	64	1514.006	569.737	1955.941	2.657377	0.7740550456
10000	100	1319.788	380.992	2126.729	3.464083235	0.6205717795
100000	1	3995.728	2898.882	500.751	1.378368626	7.979470835
100000	2	2205.853	1299.111	496.293	1.697971151	4.4446587
100000	4	1309.562	752.165	453.279	1.741056816	2.889085971
100000	8	759.235	355.94	416.158	2.133042086	1.824391217
100000	16	413.461	168.189	338.497	2.45831178	1.221461342
100000	32	210.711	74.033	234.106	2.846176705	0.9000666365
100000	64	103.367	38.865	134.722	2.659642352	0.7672614718
100000	100	70.296	24.18	96.128	2.90719603	0.7312749667
1000000	1	330.142	184.781	47.947	1.786666378	6.88556114
1000000	2	147.552	89.482	48.158	1.648957332	3.063914614
1000000	4	75.214	48.161	38.61	1.561720064	1.948044548
1000000	8	39.693	23.755	30.018	1.670932435	1.322306616
1000000	16	23.65	12.317	23.324	1.920110416	1.013977019
1000000	32	12.367	5.821	13.327	2.124549047	0.9279657837
1000000	64	6.577	2.959	7.576	2.222710375	0.8681362196
1000000	100	4.734	1.932	5.345	2.450310559	0.8856875585

As shown above, divide binary search always outperforms plain binary search. When the lookup value percentage is very high (over half of the dictionary), scan start outperforming divide binary search, but not by much.
Based on this, I've decided to keep all 3 algorithms and use divide binary search as the default

Jackie-Jiang added enhancement documentation release-notes Referenced by PRs that need attention when compiling the next release notes performance labels Sep 11, 2023

Jackie-Jiang requested a review from xiangfu0 September 11, 2023 04:24

Jackie-Jiang force-pushed the optimize_in_evaluation branch from a4aa804 to c7a35ea Compare September 11, 2023 04:26

xiangfu0 approved these changes Sep 11, 2023

View reviewed changes

xiangfu0 reviewed Sep 11, 2023

View reviewed changes

Jackie-Jiang force-pushed the optimize_in_evaluation branch from c7a35ea to c2254ce Compare September 14, 2023 06:24

Optimize in clause evaluation

8ebcc80

Jackie-Jiang force-pushed the optimize_in_evaluation branch from c2254ce to 8ebcc80 Compare September 14, 2023 07:17

Jackie-Jiang merged commit 8abe86b into apache:master Sep 14, 2023
21 checks passed

Jackie-Jiang deleted the optimize_in_evaluation branch September 14, 2023 17:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize in clause evaluation #11557

Optimize in clause evaluation #11557

Jackie-Jiang commented Sep 11, 2023 •

edited

Loading

kishoreg commented Sep 11, 2023

Jackie-Jiang commented Sep 11, 2023

kishoreg commented Sep 11, 2023

kishoreg commented Sep 11, 2023

codecov-commenter commented Sep 11, 2023 •

edited

Loading

xiangfu0 Sep 11, 2023

Jackie-Jiang Sep 11, 2023

Jackie-Jiang commented Sep 11, 2023

ankitsultana commented Sep 11, 2023

jasperjiaguo commented Sep 13, 2023 •

edited

Loading

Jackie-Jiang commented Sep 14, 2023 •

edited

Loading

Optimize in clause evaluation #11557

Optimize in clause evaluation #11557

Conversation

Jackie-Jiang commented Sep 11, 2023 • edited Loading

kishoreg commented Sep 11, 2023

Jackie-Jiang commented Sep 11, 2023

kishoreg commented Sep 11, 2023

kishoreg commented Sep 11, 2023

codecov-commenter commented Sep 11, 2023 • edited Loading

Codecov Report

xiangfu0 Sep 11, 2023

Choose a reason for hiding this comment

Jackie-Jiang Sep 11, 2023

Choose a reason for hiding this comment

Jackie-Jiang commented Sep 11, 2023

ankitsultana commented Sep 11, 2023

jasperjiaguo commented Sep 13, 2023 • edited Loading

Jackie-Jiang commented Sep 14, 2023 • edited Loading

Jackie-Jiang commented Sep 11, 2023 •

edited

Loading

codecov-commenter commented Sep 11, 2023 •

edited

Loading

jasperjiaguo commented Sep 13, 2023 •

edited

Loading

Jackie-Jiang commented Sep 14, 2023 •

edited

Loading