Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Upsert deletion for TTL: construct queryableDocIds when adding segments out of TTL #11791

Merged
merged 5 commits into from
Nov 3, 2023

Conversation

deemoliu
Copy link
Contributor

bugfix, upsert: construct queryableDocIds when adding segments out of TTL

Previously we don't handle cases that a upsert table with both upsert deletion and upsert ttl.
In this PR, fixed todo items, construct queryableDocIds when adding segments out of TTL.

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2023

Codecov Report

Merging #11791 (839e935) into master (d6f22ac) will increase coverage by 26.57%.
Report is 27 commits behind head on master.
The diff coverage is 82.35%.

@@              Coverage Diff              @@
##             master   #11791       +/-   ##
=============================================
+ Coverage     34.84%   61.42%   +26.57%     
- Complexity      946     1147      +201     
=============================================
  Files          2299     2378       +79     
  Lines        124763   128880     +4117     
  Branches      19289    19927      +638     
=============================================
+ Hits          43479    79166    +35687     
+ Misses        78243    43997    -34246     
- Partials       3041     5717     +2676     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.37% <82.35%> (+26.55%) ⬆️
java-21 61.30% <82.35%> (+26.57%) ⬆️
skip-bytebuffers-false 61.40% <82.35%> (+26.55%) ⬆️
skip-bytebuffers-true 34.64% <0.00%> (-0.07%) ⬇️
temurin 61.42% <82.35%> (+26.57%) ⬆️
unittests 61.42% <82.35%> (+14.78%) ⬆️
unittests1 46.62% <0.00%> (-0.02%) ⬇️
unittests2 27.65% <82.35%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...he/pinot/segment/local/utils/TableConfigUtils.java 77.77% <ø> (+77.77%) ⬆️
...cal/upsert/BasePartitionUpsertMetadataManager.java 42.78% <82.35%> (+42.78%) ⬆️

... and 853 files with indirect coverage changes

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect. Please add a test to ensure we can get the correct behavior

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good in general. We can further optimize it

queryableDocIds = new MutableRoaringBitmap();
// we allow passing null values for primaryKey columns and comparisonColumns to read deleteColumn to avoid paying
// the overhead of reading primary key, comparison column etc. They will never be null in other scenarios.
UpsertUtils.RecordInfoReader recordInfoReader =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest not modifying the RecordInfoReader because it is not designed for this purpose. We can simply create a PinotSegmentColumnReader for _deleteRecordColumn, then iterate over the validDocIds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when iterate the value of _deleteRecordColumn, can deletedRecord be null? can this be further optimize?

        if (deleteRecordColumnReader.isNull(doc_id)) {
          // defaultNullValue for deleteColumn is "false"
          queryableDocIds.add(doc_id);
        } else if (!(Boolean) deleteRecordColumnReader.getValue(doc_id)) {
          queryableDocIds.add(doc_id);
        }

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. Applied a clean up commit

@Jackie-Jiang Jackie-Jiang merged commit cf41a0e into apache:master Nov 3, 2023
18 of 19 checks passed
@deemoliu
Copy link
Contributor Author

deemoliu commented Nov 3, 2023

thanks @Jackie-Jiang!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants