Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Early terminate orderby when columns already sorted #8228

Merged
merged 9 commits into from
Feb 24, 2022

Conversation

jt-kloudfuse
Copy link
Contributor

@jt-kloudfuse jt-kloudfuse commented Feb 18, 2022

Description

For a query like, Order by col1, col2 limit k will take all the n matching rows and add it to a priority queue of size k.
This is nlogk operation which can be quite expensive for a large n.
In the above example, if the docs are already sorted by col1, col2 then there is no need for any
sorting at all (only limit is needed).

This PR basically tracks if left most columns are sorted and if they all are, then short circuits the nlogk loop.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • [ No]

Does this PR fix a zero-downtime upgrade introduced earlier?

  • [ No]

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • [ No]

Release Notes

Documentation

@codecov-commenter
Copy link

codecov-commenter commented Feb 19, 2022

Codecov Report

Merging #8228 (294b07b) into master (582d621) will increase coverage by 6.48%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #8228      +/-   ##
============================================
+ Coverage     64.28%   70.77%   +6.48%     
+ Complexity     4312     4240      -72     
============================================
  Files          1581     1629      +48     
  Lines         83122    85200    +2078     
  Branches      12587    12827     +240     
============================================
+ Hits          53436    60297    +6861     
+ Misses        25852    20743    -5109     
- Partials       3834     4160     +326     
Flag Coverage Δ
integration1 28.90% <100.00%> (?)
integration2 27.51% <100.00%> (?)
unittests1 66.96% <100.00%> (-0.38%) ⬇️
unittests2 14.09% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../core/operator/query/SelectionOrderByOperator.java 94.30% <100.00%> (+3.73%) ⬆️
.../org/apache/pinot/core/plan/SelectionPlanNode.java 100.00% <100.00%> (ø)
...ls/nativefst/automaton/MinimizationOperations.java 0.00% <0.00%> (-36.20%) ⬇️
...tils/nativefst/automaton/TransitionComparator.java 9.37% <0.00%> (-25.00%) ⬇️
...ent/local/utils/nativefst/automaton/Automaton.java 57.08% <0.00%> (-16.86%) ⬇️
...gment/spi/partition/HashCodePartitionFunction.java 66.66% <0.00%> (-11.12%) ⬇️
...cal/utils/nativefst/automaton/BasicOperations.java 28.57% <0.00%> (-10.85%) ⬇️
...segment/local/utils/nativefst/automaton/State.java 36.58% <0.00%> (-9.76%) ⬇️
.../plugin/inputformat/thrift/ThriftRecordReader.java 86.95% <0.00%> (-3.75%) ⬇️
.../java/org/apache/pinot/spi/data/TimeFieldSpec.java 88.63% <0.00%> (-2.28%) ⬇️
... and 404 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 582d621...294b07b. Read the comment docs.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are only optimizing the case of all order-by columns are sorted, we can have the check within getNextBlock() and add a new method computeAllSorted() to only add the first limit records into the priority queue. Mixing the logic with existing computation will introduce extra overhead, and hard to read

@jt-kloudfuse
Copy link
Contributor Author

Since we are only optimizing the case of all order-by columns are sorted, we can have the check within getNextBlock() and add a new method computeAllSorted() to only add the first limit records into the priority queue. Mixing the logic with existing computation will introduce extra overhead, and hard to read

I did consider doing that.

At the moment, getNexBlock has a conditional split for all/partially ordered columns (where computeAllOrdered is where all columns in select are also in the order by if I understand correctly )

If we introduce a case for when all orderBy columns are pre-sorted (early termination case) then wouldn't we again have a allOrdered/partialOrdered case within that as well?

@Jackie-Jiang
Copy link
Contributor

At the moment, getNexBlock has a conditional split for all/partially ordered columns (where computeAllOrdered is where all columns in select are also in the order by if I understand correctly )

If we introduce a case for when all orderBy columns are pre-sorted (early termination case) then wouldn't we again have a allOrdered/partialOrdered case within that as well?

No. If all the order-by columns are sorted, we can simply pick the first limit docs without worrying about the optimization of partialOrdered case.

@jt-kloudfuse
Copy link
Contributor Author

At the moment, getNexBlock has a conditional split for all/partially ordered columns (where computeAllOrdered is where all columns in select are also in the order by if I understand correctly )
If we introduce a case for when all orderBy columns are pre-sorted (early termination case) then wouldn't we again have a allOrdered/partialOrdered case within that as well?

No. If all the order-by columns are sorted, we can simply pick the first limit docs without worrying about the optimization of partialOrdered case.

May be we can get a slack call to resolve this quickly.

The code "without this PR" looks like below

    if (_expressions.size() == _orderByExpressions.size()) {
      return computeAllOrdered();
    } else {
      return computePartiallyOrdered();
    }

Here computeAllOrdered is where all columns in projection are in order by clause. And, computePartiallyOrdered is where projection has more columns that in order by clause.

The two cases will also apply to the early terminate case. So with this optimization we will have 4 cases.

  • computeAllOrderedWithEarlyTermination
  • computePartiallyOrderedWithEarlyTermination
  • computeAllOrdered
  • computePartiallyOrdered

I am not sure this is any more readable that what I have.

@Jackie-Jiang
Copy link
Contributor

At the moment, getNexBlock has a conditional split for all/partially ordered columns (where computeAllOrdered is where all columns in select are also in the order by if I understand correctly )
If we introduce a case for when all orderBy columns are pre-sorted (early termination case) then wouldn't we again have a allOrdered/partialOrdered case within that as well?

No. If all the order-by columns are sorted, we can simply pick the first limit docs without worrying about the optimization of partialOrdered case.

May be we can get a slack call to resolve this quickly.

The code "without this PR" looks like below

    if (_expressions.size() == _orderByExpressions.size()) {
      return computeAllOrdered();
    } else {
      return computePartiallyOrdered();
    }

Here computeAllOrdered is where all columns in projection are in order by clause. And, computePartiallyOrdered is where projection has more columns that in order by clause.

The two cases will also apply to the early terminate case. So with this optimization we will have 4 cases.

  • computeAllOrderedWithEarlyTermination
  • computePartiallyOrderedWithEarlyTermination
  • computeAllOrdered
  • computePartiallyOrdered

I am not sure this is any more readable that what I have.

What I am suggesting is something like:

    if (allOrderByColumnsSorted()) {
      return computeAllSorted();
    } else if (_expressions.size() == _orderByExpressions.size()) {
      return computeAllOrdered();
    } else {
      return computePartiallyOrdered();
    }

If all order-by columns are sorted, we don't need to do the partially ordered optimization, and the logic is all separated which is more clear and can avoid unnecessary overhead

Signed-off-by: Jagan <jt@kloudfuse.com>
@jt-kloudfuse
Copy link
Contributor Author

@Jackie-Jiang addressed review comment.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good

Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
@jt-kloudfuse jt-kloudfuse marked this pull request as ready for review February 22, 2022 18:29
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor comments

Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Please setup the Pinot Style and reformat the changes

Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
Signed-off-by: Jagan <jt@kloudfuse.com>
@Jackie-Jiang Jackie-Jiang merged commit a1d440b into apache:master Feb 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants