-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Early terminate orderby when columns already sorted #8228
Conversation
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
Codecov Report
@@ Coverage Diff @@
## master #8228 +/- ##
============================================
+ Coverage 64.28% 70.77% +6.48%
+ Complexity 4312 4240 -72
============================================
Files 1581 1629 +48
Lines 83122 85200 +2078
Branches 12587 12827 +240
============================================
+ Hits 53436 60297 +6861
+ Misses 25852 20743 -5109
- Partials 3834 4160 +326
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are only optimizing the case of all order-by columns are sorted, we can have the check within getNextBlock()
and add a new method computeAllSorted()
to only add the first limit
records into the priority queue. Mixing the logic with existing computation will introduce extra overhead, and hard to read
I did consider doing that. At the moment, getNexBlock has a conditional split for all/partially ordered columns (where computeAllOrdered is where all columns in select are also in the order by if I understand correctly ) If we introduce a case for when all orderBy columns are pre-sorted (early termination case) then wouldn't we again have a allOrdered/partialOrdered case within that as well? |
No. If all the order-by columns are sorted, we can simply pick the first |
May be we can get a slack call to resolve this quickly. The code "without this PR" looks like below
Here The two cases will also apply to the early terminate case. So with this optimization we will have 4 cases.
I am not sure this is any more readable that what I have. |
What I am suggesting is something like:
If all order-by columns are sorted, we don't need to do the partially ordered optimization, and the logic is all separated which is more clear and can avoid unnecessary overhead |
Signed-off-by: Jagan <jt@kloudfuse.com>
c70c768
to
8bbca92
Compare
@Jackie-Jiang addressed review comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with minor comments
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise. Please setup the Pinot Style and reformat the changes
pinot-core/src/main/java/org/apache/pinot/core/plan/SelectionPlanNode.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/plan/SelectionPlanNode.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/operator/query/SelectionOrderByOperator.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jagan <jt@kloudfuse.com>
Description
For a query like, Order by col1, col2 limit k will take all the n matching rows and add it to a priority queue of size k.
This is nlogk operation which can be quite expensive for a large n.
In the above example, if the docs are already sorted by col1, col2 then there is no need for any
sorting at all (only limit is needed).
This PR basically tracks if left most columns are sorted and if they all are, then short circuits the nlogk loop.
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
Does this PR fix a zero-downtime upgrade introduced earlier?
Does this PR otherwise need attention when creating release notes? Things to consider:
Release Notes
Documentation