-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
reuse regex matcher in non FST index LIKE queries #8261
reuse regex matcher in non FST index LIKE queries #8261
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's apply the same improvement to raw value based evaluator
...ava/org/apache/pinot/core/operator/filter/predicate/RegexpLikePredicateEvaluatorFactory.java
Outdated
Show resolved
Hide resolved
Codecov Report
@@ Coverage Diff @@
## master #8261 +/- ##
=============================================
- Coverage 64.06% 30.77% -33.29%
=============================================
Files 1586 1619 +33
Lines 83399 85080 +1681
Branches 12641 12834 +193
=============================================
- Hits 53427 26184 -27243
- Misses 26129 56566 +30437
+ Partials 3843 2330 -1513
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@@ -48,6 +50,13 @@ public String getValue() { | |||
return _value; | |||
} | |||
|
|||
public Pattern getPattern() { | |||
if (_pattern == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do this lazily because it's an overhead when an FST index is available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern can be accessed by multiple threads, so might be better to make it volatile
or make it atomic swap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter, the worst that can happen is it gets compiled more than once because the operation is idempotent.
@@ -48,6 +50,13 @@ public String getValue() { | |||
return _value; | |||
} | |||
|
|||
public Pattern getPattern() { | |||
if (_pattern == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern can be accessed by multiple threads, so might be better to make it volatile
or make it atomic swap?
The profile below was taken from one of our customer's deployments - very high allocation rates are observed in no-index LIKE queries, because of matcher construction.
This PR simply reuses the
Matcher
as it will never be used across threads at the segment level.This decreases allocation significantly (2.5x) and may slightly decrease average query time.
before:
after: