-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up index in-mem overflow file #2564
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #2564 +/- ##
==========================================
+ Coverage 92.90% 93.16% +0.25%
==========================================
Files 1026 1027 +1
Lines 38579 38480 -99
==========================================
+ Hits 35842 35849 +7
+ Misses 2737 2631 -106 ☔ View full report in Codecov by Sentry. |
@@ -59,147 +35,40 @@ ku_string_t InMemOverflowFile::appendString(const char* rawString) { | |||
if (length > BufferPoolConstants::PAGE_4KB_SIZE) { | |||
throw CopyException(ExceptionMessage::overLargeStringPKValueException(length)); | |||
} | |||
std::unique_lock lck{lock}; | |||
// Allocate a new page if necessary. | |||
if (nextOffsetInPageToAppend + length >= BufferPoolConstants::PAGE_4KB_SIZE) { | |||
addANewPage(); | |||
nextOffsetInPageToAppend = 0; | |||
nextPageIdxToAppend++; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we getting away with removing this lock? This seems like a race since addANewPage
doesn't lock either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the argument that InMemOverflowFile is always accessed while holding a lock further up anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Only HashIndex is using this, and inside the overflow file, we don't do parallel insertions anymore. I should add a comment that this is no longer thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. We might do parallel insertions in the future though, at which point we would lock inside the index instead of the overflow file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would refactor all of these to something else different by then. I'm not sure which level should take the lock, but by the time, it will become clear. As for now, just clean up legacy first.
Also, can you add a summary in the commit message + PR description that explains the actual change, more in depth? |
Sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea if this change has a performance benefit?
It might have some for string insertions due to removing of a lock. But I cannot observe it through COPY NODE, as it is done in a serial way anyways, so really no contention on the removed lock there. |
This PR cleans up a bunch of useless functions and designs such as nullmask inside
InMemOverflowFile
andInMemFile
. They were originally used by Column and List. But those were all refactored to NodeGroup based implementation.Also, merged these two classes together into
InMemFile
, as there is really no need for the separation ofInMemFile
andInMemOverflowFile
.HashIndexBuilder
is the only user now. So I simplified the code to be non-thread-safe, as the user is not requiring a thread-safe one.