Insert into the hash index builder one chunk at a time #2997

Merged 1 commit on Mar 13, 2024


benjaminwinger

First part of #2938.

There is a small performance improvement, though it is somewhat obscured by the variance I was seeing between runs: roughly 100 ms when copying a table with just 60 million integer primary keys (from ~2.3 to ~2.2 seconds).

This replaces the one-at-a-time append in HashIndexBuilder with an append function that takes a StaticVector (since that's what the IndexBuilder's queues use). Resizing the index and calculating the hashes are done for all 1024 values in the StaticVector at once, before the values are inserted one by one.
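The chunked append described above can be illustrated with a minimal sketch. It is not the code from this PR: `StaticVectorSketch`, `SlotSketch`, `ChunkedIndexSketch`, the linear-probing layout, the 50% load factor, and the "stop at the first duplicate" return value are all assumptions made for illustration. Only the three-step shape (resize once, hash the whole chunk, then insert one by one) comes from the description.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the fixed-capacity StaticVector used by the queues.
template<typename T, std::size_t N>
struct StaticVectorSketch {
    std::array<T, N> values{};
    std::size_t size = 0;
    void push_back(const T& v) {
        assert(size < N);
        values[size++] = v;
    }
};

using EntrySketch = std::pair<std::int64_t, std::uint64_t>;

// Hypothetical slot type for a toy open-addressing table.
struct SlotSketch {
    EntrySketch entry{};
    bool used = false;
};

// Toy index with linear probing; not the real HashIndexBuilder layout.
class ChunkedIndexSketch {
public:
    static constexpr std::size_t CHUNK_CAPACITY = 1024;
    using Chunk = StaticVectorSketch<EntrySketch, CHUNK_CAPACITY>;

    ChunkedIndexSketch() : slots(16) {}

    // Appends a whole chunk. Returns how many entries were inserted before
    // the first duplicate key (chunk.size if there was none).
    std::size_t append(const Chunk& chunk) {
        // Step 1: resize once, for the whole chunk.
        reserve(numEntries + chunk.size);
        // Step 2: compute every hash up front.
        std::array<std::size_t, CHUNK_CAPACITY> hashes{};
        for (std::size_t i = 0; i < chunk.size; ++i) {
            hashes[i] = std::hash<std::int64_t>{}(chunk.values[i].first);
        }
        // Step 3: insert the values one by one using the precomputed hashes.
        for (std::size_t i = 0; i < chunk.size; ++i) {
            if (!place(chunk.values[i], hashes[i])) {
                return i;
            }
            ++numEntries;
        }
        return chunk.size;
    }

    bool lookup(std::int64_t key, std::uint64_t& out) const {
        std::size_t pos = std::hash<std::int64_t>{}(key) & (slots.size() - 1);
        while (slots[pos].used) {
            if (slots[pos].entry.first == key) {
                out = slots[pos].entry.second;
                return true;
            }
            pos = (pos + 1) & (slots.size() - 1);
        }
        return false;
    }

    std::size_t size() const { return numEntries; }
    std::size_t resizeCount() const { return resizes; }

private:
    // Grows to a power-of-two slot count that keeps the load factor <= 0.5.
    void reserve(std::size_t needed) {
        std::size_t newSize = slots.size();
        while (newSize < needed * 2) {
            newSize *= 2;
        }
        if (newSize == slots.size()) {
            return;
        }
        std::vector<SlotSketch> old(newSize);
        old.swap(slots);
        for (const SlotSketch& s : old) {
            if (s.used) {
                place(s.entry, std::hash<std::int64_t>{}(s.entry.first));
            }
        }
        ++resizes;
    }

    // Returns false if the key is already present; does not touch numEntries.
    bool place(const EntrySketch& entry, std::size_t hash) {
        std::size_t pos = hash & (slots.size() - 1);
        while (slots[pos].used) {
            if (slots[pos].entry.first == entry.first) {
                return false;
            }
            pos = (pos + 1) & (slots.size() - 1);
        }
        slots[pos].entry = entry;
        slots[pos].used = true;
        return true;
    }

    std::vector<SlotSketch> slots;
    std::size_t numEntries = 0;
    std::size_t resizes = 0;
};
```

In this sketch a full 1024-entry chunk triggers a single rehash, where a per-value append with the same growth policy would rehash several times on the way up from the initial 16 slots.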
I also removed the second type parameter from the HashIndexBuilder: the StaticVector stores a std::string, so it makes more sense to use std::conditional to hardcode the places that take a std::string/std::string_view than to add more template parameters (and there is really only one variable parameter anyway).
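As a rough illustration of the std::conditional approach, the sketch below derives the "view" type from a single template parameter instead of passing it as a second one. The names `KeyViewSketch` and `HashIndexBuilderSketch` and the two helper functions are hypothetical and are not taken from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical alias: std::string keys are taken as std::string_view,
// every other key type is taken as itself.
template<typename T>
using KeyViewSketch =
    std::conditional_t<std::is_same_v<T, std::string>, std::string_view, T>;

// Hypothetical builder with a single template parameter (the stored type).
template<typename T>
struct HashIndexBuilderSketch {
    using Key = KeyViewSketch<T>;

    // Hashes the key without copying string data.
    static std::size_t hashKey(Key key) { return std::hash<Key>{}(key); }

    // Compares an incoming key against a stored value.
    static bool equals(Key key, const T& stored) { return key == stored; }
};
```

With this shape, `HashIndexBuilderSketch<std::string>::Key` is `std::string_view` and `HashIndexBuilderSketch<std::int64_t>::Key` is `std::int64_t`, so callers only ever name the stored type.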


codecov bot commented Mar 5, 2024

Codecov Report

Attention: Patch coverage is 96.29630%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.28%. Comparing base (0c26056) to head (af50489).

| Files | Patch % | Lines |
|---|---|---|
| src/include/storage/index/hash_index_builder.h | 85.71% | 1 Missing ⚠️ |
| ...rc/processor/operator/persistent/index_builder.cpp | 90.90% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2997   +/-   ##
=======================================
  Coverage   93.28%   93.28%           
=======================================
  Files        1128     1128           
  Lines       42947    42949    +2     
=======================================
+ Hits        40062    40067    +5     
+ Misses       2885     2882    -3     


@benjaminwinger benjaminwinger force-pushed the hash-index-builder-chunks branch 2 times, most recently from 654065f to 72b0871 Compare March 12, 2024 20:02
@benjaminwinger benjaminwinger merged commit d8487a0 into master Mar 13, 2024
16 checks passed
@benjaminwinger benjaminwinger deleted the hash-index-builder-chunks branch March 13, 2024 13:30