-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Constant compression #2516
Constant compression #2516
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #2516 +/- ##
==========================================
+ Coverage 92.83% 92.91% +0.07%
==========================================
Files 1025 1025
Lines 38324 38401 +77
==========================================
+ Hits 35579 35679 +100
+ Misses 2745 2722 -23 ☔ View full report in Codecov by Sentry. |
bdccfcb
to
a3c52c8
Compare
Benchmark actually failed, despite the job status. The storage is not compatible with the old version since I'd changed the compression ID numbers to turn them into a bitfield. Maybe that should be changed anyway, since what's stored in the file doesn't need to be a bitfield, and that limits the total number of values. We could have a separate bitfield for configuration, but since it's not being used outside of the ColumnChunk, maybe it's a better idea to revert that part for now and leave more fine-grained compression settings to a later time. |
5108229
to
4a15a0f
Compare
1570f46
to
de4f5ff
Compare
Removed various dedicated functions in place of using the compression interfaces
Disabled for nulls and the Rel table CSR offset column
de4f5ff
to
9147b85
Compare
Constant compressed columns use
INVALID_PAGE_IDX
as their page index in the metadata, and reading from them uses the normal functions, but there's a special case for when we try to read from anINVALID_PAGE_IDX
.To get it working for nulls I had to set up out of place commits for the null column. This always happens when the parent column does an out of place commit (unnecessary, but it would take a lot of work to de-couple that I think), but we may now need to do an out-of-place commit for the null column when doing an in-place commit on its parent column. As such, I've removed in-place updates to the null column when we do in-place updates to the parent column, and have it otherwise doing an out-of-place update on the null column only if the main column does an in-place update to avoid duplicating work.
I'd added some bounds checking, and more fine-grained control over the compression choices at the ColumChunk level when debugging, which I've left in, but could be removed since I was able to get it working for every fixed-sized type.