Reclaim empty overflow slots in memory hash index #3438

benjaminwinger · 2024-05-02T19:56:29Z

When splitting, move empty overflow slots into a global linked list and re-use them before allocating new slots.

I started working on this to simplify how deletions will work in the in-memory hash index (needed to unify the hash index local storage for copies and inserts/deletions). Removing empty overflow slots from the end of the slot chains makes it easier to find the last entry in a slot without having to backtrack when the last slot is empty.
This ended up being a little more complicated than I expected, but for a hash index of 60 million consecutive integers it reduces the memory used by the slots from roughly 2.28GB to 1.75GB and cuts the number of overflow slots by more than half. It also seems to slightly improve performance (presumably because it reduces the number of allocations).
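
For readers unfamiliar with the technique, the reclamation described above amounts to a free list threaded through the overflow slots themselves. The following is a minimal sketch, not the code in this PR: `OverflowSlot`, `OverflowSlotPool`, `firstFreeSlotId`, and `INVALID_SLOT` are hypothetical names chosen for illustration, and the real slots also hold entries and fingerprints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical, simplified types for illustration; not the definitions used in the PR.
constexpr uint32_t INVALID_SLOT = std::numeric_limits<uint32_t>::max();

struct OverflowSlot {
    uint32_t numEntries = 0;
    // Next slot in the chain while in use; next free slot while on the free list.
    uint32_t nextOvfSlotId = INVALID_SLOT;
};

class OverflowSlotPool {
public:
    // Re-use a reclaimed slot if one exists; otherwise allocate a new slot.
    uint32_t allocate() {
        if (firstFreeSlotId != INVALID_SLOT) {
            uint32_t id = firstFreeSlotId;
            firstFreeSlotId = slots[id].nextOvfSlotId; // pop the list head
            slots[id] = OverflowSlot{};                // reset before re-use
            return id;
        }
        slots.emplace_back();
        return static_cast<uint32_t>(slots.size() - 1);
    }

    // Push a slot that became empty (e.g. during a split) onto the free list.
    void reclaim(uint32_t id) {
        assert(id < slots.size());
        slots[id].numEntries = 0;
        slots[id].nextOvfSlotId = firstFreeSlotId;
        firstFreeSlotId = id;
    }

    std::size_t size() const { return slots.size(); }

private:
    std::vector<OverflowSlot> slots;
    uint32_t firstFreeSlotId = INVALID_SLOT; // head of the global free list
};
```

In this sketch, calling `reclaim` on a slot and then `allocate` returns the same slot id without growing the underlying vector, which is the effect the memory numbers above rely on.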

Something similar could be done for disk slots (see last TODO in #2938 (comment)), the main difference being that disk slots may have gaps, but the gaps could be removed when splitting.

To avoid breaking the storage format, the new field (which is not used by the on-disk index) is added to the end of the hash index header.
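
A rough illustration of why appending keeps the format compatible: the existing fields stay at the same byte offsets, so a reader that only knows the old layout still finds them. The struct and field names below (`HeaderBefore`, `HeaderAfter`, `firstFreeOverflowSlotId`, and the three existing fields) are hypothetical and are not taken from the actual header definition.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical header layouts for illustration only; the real header differs.
struct HeaderBefore {
    uint64_t currentLevel;
    uint64_t nextSplitSlotId;
    uint64_t numEntries;
};

struct HeaderAfter {
    uint64_t currentLevel;
    uint64_t nextSplitSlotId;
    uint64_t numEntries;
    // Appended last so that no existing field moves; the on-disk index ignores it.
    uint64_t firstFreeOverflowSlotId;
};

// Every pre-existing field keeps its offset in the new layout.
static_assert(offsetof(HeaderAfter, currentLevel) == offsetof(HeaderBefore, currentLevel),
              "currentLevel moved");
static_assert(offsetof(HeaderAfter, nextSplitSlotId) == offsetof(HeaderBefore, nextSplitSlotId),
              "nextSplitSlotId moved");
static_assert(offsetof(HeaderAfter, numEntries) == offsetof(HeaderBefore, numEntries),
              "numEntries moved");
```

Inserting the field anywhere but the end would shift at least one of these offsets and fail the corresponding `static_assert`.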

src/storage/index/in_mem_hash_index.cpp

benjaminwinger force-pushed the hash-index-reclaiming branch 2 times, most recently from 658ad60 to 3603c9e Compare May 3, 2024 14:47

ray6080 approved these changes May 6, 2024

src/storage/index/in_mem_hash_index.cpp (outdated)

benjaminwinger force-pushed the hash-index-reclaiming branch from 3603c9e to d136a6d Compare May 7, 2024 16:06

Reclaim empty overflow slots in memory hash index

62610ee

benjaminwinger force-pushed the hash-index-reclaiming branch from d136a6d to 62610ee Compare May 8, 2024 13:28

benjaminwinger merged commit ea1e798 into master May 9, 2024
18 checks passed

benjaminwinger deleted the hash-index-reclaiming branch May 9, 2024 14:52
