allow up to 4GB per bitmap index #8796
Conversation
```diff
-return new ImmutableRoaringBitmap(_bitmapBuffer.toDirectByteBuffer(offset - _firstOffset, length));
+long offset = getOffset(dictId);
+long length = getOffset(dictId + 1) - offset;
+return new ImmutableRoaringBitmap(_bitmapBuffer.toDirectByteBuffer(offset - _firstOffset, (int) length));
```
Note that the maximum size of a RoaringBitmap is 2^16 * (8KB + 2B) ~ 512MB, so the cast to int never overflows.
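As a sanity check, the 512MB bound can be computed directly. This is a standalone sketch, assuming the usual serialized-size bound for a roaring bitmap: at most 2^16 containers (one per 16-bit high key), each at most an 8KB bitmap container plus roughly 2 bytes of per-container overhead.

```java
public class BitmapSizeBound {
    public static void main(String[] args) {
        long maxContainers = 1L << 16;         // one container per 16-bit high key
        long maxContainerBytes = 8 * 1024 + 2; // 8KB bitmap container + ~2B overhead
        long maxBitmapBytes = maxContainers * maxContainerBytes;

        System.out.println(maxBitmapBytes);                       // 537001984 (~512MB)
        System.out.println(maxBitmapBytes <= Integer.MAX_VALUE);  // true: the int cast is safe
    }
}
```

So a single bitmap always fits in an int; it is only the *offsets* into the buffer of many bitmaps that can exceed 2GB.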
It'd be good to add this analysis as a comment in the code above as well.
This isn’t truly backward compatible - all servers need to be upgraded before the first segment with a bitmap index larger than 2GB is completed.
Codecov Report
@@ Coverage Diff @@
## master #8796 +/- ##
============================================
- Coverage 69.69% 62.81% -6.89%
+ Complexity 4617 4602 -15
============================================
Files 1733 1687 -46
Lines 91188 89200 -1988
Branches 13630 13409 -221
============================================
- Hits 63555 56027 -7528
- Misses 23216 29128 +5912
+ Partials 4417 4045 -372
Do we check overflow currently? I feel the old code won't fail on the writer side, but will fail on the reader side when the size is over 2GB?
I don't think overflow is being checked. This is the stack trace I saw while attempting to generate a very large index (2GB+). This was more of a test than a production use case, so we don't really need the limit to be increased, but if it can be done easily then there's no harm. It would be good, though, to check for overflow and add an error message.
Currently the code fails on the writer side, but would also fail on the reader side. The right way to evolve this is to make these changes while continuing to fail beyond 2GB on the writer side, then raise the limit to 4GB in the following release.
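A writer-side guard for the first stage could look like the following sketch. The names (`checkIndexSize`, the `column` argument) are illustrative, not Pinot's actual API; the point is to fail fast with a clear message while the 2GB limit is still enforced during the transition release.

```java
public class IndexSizeGuard {
    // Hypothetical writer-side check: reject oversized indexes explicitly
    // instead of silently overflowing an int offset.
    static void checkIndexSize(long totalBytes, String column) {
        final long maxBytes = Integer.MAX_VALUE; // 2GB limit during the transition release
        if (totalBytes > maxBytes) {
            throw new IllegalStateException("Bitmap index for column '" + column + "' is "
                + totalBytes + " bytes, exceeding the " + maxBytes + " byte limit");
        }
    }

    public static void main(String[] args) {
        checkIndexSize(1L << 30, "ok_column");       // 1GB: passes
        try {
            checkIndexSize(3L << 30, "big_column");  // 3GB: rejected with a clear error
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Once all readers in a cluster understand offsets beyond 2GB, the guard's limit can be relaxed to 4GB without any format change.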
It would be good to add some comments for future reference, but otherwise this looks good to me. The compatibility issue doesn't appear to be a concern (at least on the clusters I work with) since all the index files are smaller than 2GB.
Shall we add a test where we generate two bitmap inverted indexes, one > 2GB and one < 2GB, and verify that the reader code changed in this PR can read both indexes correctly?
< 2GB is already tested. Tests could be added for > 2GB, with the downside of needing to create large files.
@jackjlli this is not backward incompatible
I added the label as you mentioned the following sentence:
I think the label is used for identifying any action needed when a newer version of Pinot is bumped in the service, so that people are aware of those actions. Feel free to update or remove the label if you feel it's unnecessary.
Segments produced by the existing code can still be consumed, but the existing code isn’t forward compatible with a limit increase. This change can be made safely in two stages as mentioned above: 1) release these code changes but keep enforcing the 2GB limit, which makes the reader code forward compatible with a limit increase while it still can’t produce unreadable segments, then 2) relax the 2GB limit in the next release. This only requires that upgrading users not skip versions.
This doubles the permissible size of a bitmap index in a backward compatible way by using the extra bit, which is currently always clear because writing fails when the index size exceeds 2GB. Index sizes over 2GB aren't recommended, but this is more tolerant than throwing an exception.
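The "extra bit" trick described above can be sketched as follows: offsets are stored as 32-bit ints, and since the writer never produced an offset past 2GB, the sign bit was always clear. Reinterpreting the stored int as *unsigned* doubles the addressable range to 4GB without changing the on-disk format. This is a standalone illustration, not the actual Pinot code:

```java
import java.nio.ByteBuffer;

public class UnsignedOffsetDemo {
    public static void main(String[] args) {
        // An offset past 2GB does not fit in a signed int, but its bit
        // pattern does: the sign bit is the "extra bit" the PR exploits.
        long realOffset = 3L * 1024 * 1024 * 1024; // 3GB
        ByteBuffer buf = ByteBuffer.allocate(4);
        buf.putInt(0, (int) realOffset);           // low 32 bits stored; looks negative as signed

        int stored = buf.getInt(0);
        System.out.println(stored);                          // -1073741824 when read as signed
        System.out.println(Integer.toUnsignedLong(stored));  // 3221225472: original 3GB offset recovered
    }
}
```

Old readers would misinterpret such an offset as negative, which is why all servers must be upgraded before any segment with an index over 2GB is completed.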