Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix constant compression in-place check for bools #3211

Merged
merged 1 commit into from
Apr 5, 2024

Conversation

benjaminwinger
Copy link
Collaborator

Bools are stored packed in column chunks, so indexing into the data using the size in bytes of the type will not work.

This is done as a quick fix, however I think it would be a good idea to move away from passing around raw uint8_t* when passing around type-generic data. Instead, I think we should create abstractions for storing and accessing the raw buffers. That would let us more easily distinguish between, for example, a ValueVector buffer and a ColumnChunk buffer (e.g. the former stores 1 bool per byte, and the latter 1 bool per bit), and also make sure we handle these edge cases.

Copy link

codecov bot commented Apr 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.27%. Comparing base (fa0ef79) to head (e35aec3).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3211      +/-   ##
==========================================
+ Coverage   92.26%   92.27%   +0.01%     
==========================================
  Files        1161     1161              
  Lines       44129    44143      +14     
==========================================
+ Hits        40714    40732      +18     
+ Misses       3415     3411       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@benjaminwinger
Copy link
Collaborator Author

benjaminwinger commented Apr 4, 2024

Apparently this issue was already handled separately for Nulls (which this change initially broke). I don't really like that the implementations diverged.
I've removed some of the NullColumn functions which have become almost identical to the base Column versions.

Bools are stored packed in column chunks, so indexing into the data using the size in bytes of the type will not work.
@andyfengHKU andyfengHKU merged commit 2100fa3 into master Apr 5, 2024
20 checks passed
@andyfengHKU andyfengHKU deleted the constant-bool-fix branch April 5, 2024 03:37
@bigluck
Copy link

bigluck commented Apr 5, 2024

Thanks @benjaminwinger

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants