simplify segment pruning #8790

Merged

@richardstartin richardstartin commented May 27, 2022

Segment pruning feels over-engineered:

  • DataSchemaSegmentPruner and ValidSegmentPruner are just one-liners which can be applied inline when necessary
  • ColumnValueSegmentPruner and SelectionQuerySegmentPruner are mutually exclusive, so they never both need to run, and which one applies can be identified easily by examining the QueryContext
  • No new segment pruners have been added in a very long time

This leads to inefficiencies such as:

  • None of the pruner calls get inlined
  • Lots of lists are created unnecessarily
  • We end up tracing one-liner checks

This PR removes the two trivial pruners and applies their checks inline in the SegmentPrunerService, before the two remaining pruners run. It also adds a new method that decides, based on the query context, whether a pruner should run at all.
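
For readers skimming the thread, here is a rough sketch of the shape described above. It is not the code in this PR: `PruningSketch`, `Segment`, `Query`, `Pruner` and the method `isApplicableTo` are hypothetical stand-ins, and the two inline checks (empty segment, missing query columns) are an assumption about what the removed one-liners test for.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every type here is a hypothetical stand-in, not a Pinot class.
final class PruningSketch {

  /** Hypothetical segment view exposing only what the two inline checks need. */
  static final class Segment {
    final String name;
    final int totalDocs;
    final boolean hasAllQueryColumns;

    Segment(String name, int totalDocs, boolean hasAllQueryColumns) {
      this.name = name;
      this.totalDocs = totalDocs;
      this.hasAllQueryColumns = hasAllQueryColumns;
    }
  }

  /** Hypothetical query context; a real pruner would inspect filters, limits, ordering. */
  static final class Query {
    final boolean hasFilter;

    Query(boolean hasFilter) {
      this.hasFilter = hasFilter;
    }
  }

  /** Hypothetical pruner contract with an applicability check based on the query. */
  interface Pruner {
    boolean isApplicableTo(Query query);

    List<Segment> prune(List<Segment> segments, Query query);
  }

  static List<Segment> prune(List<Segment> segments, Query query, List<Pruner> pruners) {
    // The two trivial checks run inline in a single pass (assumed semantics):
    // drop empty segments and segments missing a column the query needs.
    List<Segment> selected = new ArrayList<>(segments.size());
    for (Segment segment : segments) {
      if (segment.totalDocs > 0 && segment.hasAllQueryColumns) {
        selected.add(segment);
      }
    }
    // The remaining pruners only run when the query says they can help.
    for (Pruner pruner : pruners) {
      if (selected.isEmpty()) {
        break;
      }
      if (pruner.isApplicableTo(query)) {
        selected = pruner.prune(selected, query);
      }
    }
    return selected;
  }
}
```

The point of the shape is that the cheap checks need no pruner object, intermediate list, or trace span of their own, and a pruner that cannot help for a given query is skipped before it allocates anything.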

codecov-commenter commented May 27, 2022

Codecov Report

Merging #8790 (0e7298a) into master (11a060c) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master    #8790      +/-   ##
============================================
- Coverage     69.63%   69.63%   -0.01%     
+ Complexity     4621     4619       -2     
============================================
  Files          1736     1733       -3     
  Lines         91203    91188      -15     
  Branches      13632    13630       -2     
============================================
- Hits          63513    63497      -16     
- Misses        23269    23272       +3     
+ Partials       4421     4419       -2     
| Flag | Coverage Δ |
|---|---|
| integration1 | 27.00% <96.55%> (+0.02%) ⬆️ |
| integration2 | 25.38% <100.00%> (+0.12%) ⬆️ |
| unittests1 | 66.15% <93.10%> (-0.06%) ⬇️ |
| unittests2 | 14.13% <0.00%> (-0.06%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| ...ot/core/query/pruner/ColumnValueSegmentPruner.java | 76.04% <100.00%> (+0.26%) ⬆️ |
| ...pinot/core/query/pruner/SegmentPrunerProvider.java | 84.61% <100.00%> (+17.94%) ⬆️ |
| .../pinot/core/query/pruner/SegmentPrunerService.java | 100.00% <100.00%> (ø) |
| ...core/query/pruner/SelectionQuerySegmentPruner.java | 85.88% <100.00%> (-0.49%) ⬇️ |
| ...ller/helix/core/minion/TaskTypeMetricsUpdater.java | 80.00% <0.00%> (-20.00%) ⬇️ |
| ...nt/local/startree/v2/store/StarTreeDataSource.java | 40.00% <0.00%> (-13.34%) ⬇️ |
| ...ore/query/scheduler/resources/ResourceManager.java | 84.00% <0.00%> (-12.00%) ⬇️ |
| ...ache/pinot/core/operator/docidsets/OrDocIdSet.java | 86.36% <0.00%> (-11.37%) ⬇️ |
| ...or/transform/function/IsNullTransformFunction.java | 78.57% <0.00%> (-7.15%) ⬇️ |
| ...ot/segment/local/startree/OffHeapStarTreeNode.java | 65.90% <0.00%> (-6.82%) ⬇️ |

... and 25 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 11a060c...0e7298a.

gortiz commented May 27, 2022

LGTM

richardstartin force-pushed the simplify-segment-pruners branch 2 times, most recently from 85e485e to 51231f2, on May 27, 2022 15:23

Jackie-Jiang left a comment

LGTM!
