Separate shadow pages from wal records and rework wal to use serializer #3204

Merged
ray6080 merged 3 commits into master from shadow-wal-separation
Apr 9, 2024

Contributor

@ray6080 ray6080 commented Apr 3, 2024

No description provided.

codecov bot commented Apr 3, 2024

Codecov Report

Attention: Patch coverage is 94.87179%, with 14 lines in your changes missing coverage. Please review.

Project coverage is 92.34%. Comparing base (d946982) to head (f9e839a).
Report is 8 commits behind head on master.

❗ Current head f9e839a differs from pull request most recent head 5840b68. Consider uploading reports for the commit 5840b68 to get more accurate results

Files                                   Patch %   Lines
src/storage/wal_replayer.cpp            88.23%    6 Missing ⚠️
src/storage/wal/wal.cpp                 94.00%    3 Missing ⚠️
src/include/storage/wal/wal_record.h    93.10%    2 Missing ⚠️
src/storage/wal/wal_record.cpp          98.26%    2 Missing ⚠️
src/include/common/cast.h               75.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3204      +/-   ##
==========================================
+ Coverage   92.28%   92.34%   +0.05%     
==========================================
  Files        1161     1162       +1     
  Lines       44150    44192      +42     
==========================================
+ Hits        40744    40808      +64     
+ Misses       3406     3384      -22     


}
return newVal;
} catch (std::bad_cast& e) {
KU_ASSERT(false);

Collaborator

What's the benefit of the assert here? It doesn't seem like it would give a clearer error than the exception.
Is it just so that it gets caught if we catch kuzu::common::Exception?

Contributor Author

It's just easier to debug with KU_ASSERT than by catching the std::bad_cast exception manually.
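
To illustrate the trade-off discussed above, here is a minimal sketch of a checked reference cast that asserts inside the catch handler. It is not the code in src/include/common/cast.h: the names SketchBase, SketchDerived, SKETCH_ASSERT and checkedRefCast are hypothetical, and SKETCH_ASSERT simply wraps the standard assert as a stand-in for KU_ASSERT.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical class hierarchy, used only to exercise the helper.
struct SketchBase {
    virtual ~SketchBase() = default;
};
struct SketchDerived : SketchBase {
    int value = 42;
};

// Hypothetical stand-in for a project assertion macro such as KU_ASSERT.
#define SKETCH_ASSERT(cond) assert(cond)

// A dynamic_cast to a reference type throws std::bad_cast on a mismatch.
template<typename To, typename From>
To& checkedRefCast(From& from) {
    try {
        return dynamic_cast<To&>(from);
    } catch (const std::bad_cast&) {
        // Debug builds abort here, so a debugger stops inside the cast helper
        // with the caller's stack frames still available.
        SKETCH_ASSERT(false);
        // Builds with NDEBUG skip the assert and propagate the exception.
        throw;
    }
}
```

With this shape, a bad cast in a debug build fails at the helper instead of surfacing as an uncaught std::bad_cast somewhere up the call chain, which is the debugging convenience the reply describes.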

@@ -58,9 +63,17 @@ void BufferedFileReader::read(uint8_t* data, uint64_t size) {
}
}

bool BufferedFileReader::finished() {
return bufferOffset >= bufferSize && fileSize <= fileOffset;

Collaborator

Is this always going to be correct? What if the last page in the file is only partially filled?
From how it was used before, it seems to me that it will keep reading the empty data at the end of the page, which will mess up the check for whether or not the last record is a WAL.
I think there should be some way of tracking how many records are in the WAL. Alternatively, a record type of 0 could mark invalid records; then, as long as the buffer gets reset to zero (which BufferedFileWriter is doing), the reader can break once it reaches a record with a type of 0.

Contributor Author

The buffer is refilled with bufferSize = std::min(fileSize - fileOffset, BUFFER_SIZE), so bufferSize is bounded by the actual fileSize. It won't read empty data at the end of the last page if that page is only partially filled.
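
The bounding argument above can be sketched with a small in-memory reader. This is an illustration under assumptions, not the actual BufferedFileReader: SketchBufferedReader, SKETCH_BUFFER_SIZE and the refill helper are hypothetical, the "file" is a byte vector, and fileOffset here counts the bytes already loaded into the buffer.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

// Tiny buffer so that a short "file" spans several refills (assumed value).
constexpr uint64_t SKETCH_BUFFER_SIZE = 4;

// Hypothetical reader over an in-memory byte vector standing in for a file.
class SketchBufferedReader {
public:
    explicit SketchBufferedReader(std::vector<uint8_t> contents)
        : file(std::move(contents)), fileSize(file.size()) {
        refill();
    }

    // Copies `size` bytes into `data`, refilling the buffer as needed.
    void read(uint8_t* data, uint64_t size) {
        while (size > 0) {
            if (bufferOffset >= bufferSize) {
                refill();
                if (bufferSize == 0) {
                    throw std::out_of_range("read past end of file");
                }
            }
            const uint64_t n = std::min(size, bufferSize - bufferOffset);
            std::memcpy(data, buffer + bufferOffset, n);
            data += n;
            size -= n;
            bufferOffset += n;
        }
    }

    // True once the buffer is drained and nothing is left to load.
    bool finished() const { return bufferOffset >= bufferSize && fileSize <= fileOffset; }

private:
    void refill() {
        // The last refill is shorter than the buffer when the file size is
        // not a multiple of the buffer size, so no padding is ever exposed.
        bufferSize = std::min(fileSize - fileOffset, SKETCH_BUFFER_SIZE);
        if (bufferSize > 0) {
            std::memcpy(buffer, file.data() + fileOffset, bufferSize);
        }
        fileOffset += bufferSize;
        bufferOffset = 0;
    }

    std::vector<uint8_t> file;
    uint64_t fileSize;
    uint64_t fileOffset = 0;
    uint64_t bufferOffset = 0;
    uint64_t bufferSize = 0;
    uint8_t buffer[SKETCH_BUFFER_SIZE] = {};
};
```

For a 6-byte input and a 4-byte buffer, the second refill loads only 2 bytes, and finished() turns true exactly when those 2 bytes have been consumed. Note that this only holds if the file size itself reflects the bytes actually written, which is the reviewer's underlying concern.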

Contributor Author

Ensuring the WAL is not corrupted is a separate thing we need to handle. I feel we might need a checksum for each record.
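
One possible shape for the per-record checksum idea floated above, purely as a sketch: nothing in this PR implements it. The framing (payload followed by a 4-byte little-endian checksum), the choice of 32-bit FNV-1a, and the names sketchChecksum, sketchFrameRecord and sketchVerifyRecord are all assumptions made for illustration; a real WAL would more likely use a CRC.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 32-bit FNV-1a hash, used here as an illustrative checksum.
uint32_t sketchChecksum(const uint8_t* data, std::size_t size) {
    uint32_t hash = 2166136261u; // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 16777619u; // FNV prime
    }
    return hash;
}

// Hypothetical framing: payload bytes followed by a 4-byte little-endian checksum.
std::vector<uint8_t> sketchFrameRecord(const std::vector<uint8_t>& payload) {
    std::vector<uint8_t> framed = payload;
    const uint32_t sum = sketchChecksum(payload.data(), payload.size());
    for (int shift = 0; shift < 32; shift += 8) {
        framed.push_back(static_cast<uint8_t>(sum >> shift));
    }
    return framed;
}

// Recomputes the checksum over the payload and compares it with the stored one.
bool sketchVerifyRecord(const std::vector<uint8_t>& framed) {
    if (framed.size() < 4) {
        return false;
    }
    const std::size_t payloadSize = framed.size() - 4;
    uint32_t stored = 0;
    for (int i = 0; i < 4; ++i) {
        stored |= static_cast<uint32_t>(framed[payloadSize + i]) << (8 * i);
    }
    return stored == sketchChecksum(framed.data(), payloadSize);
}
```

A replayer built along these lines could stop at the first record that fails verification instead of interpreting damaged bytes as a record.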

KU_UNREACHABLE;
}
}
walRecord->type = type;

Collaborator

Isn't this redundant since it's also being set in the constructors?

Contributor Author

Not really. The deserializer uses the default constructor, which doesn't take a type. A better way would be to remove the default constructor; I will do that in a later PR.
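
A minimal sketch of why the assignment is not redundant on the deserialization path. The types below (SketchRecordType, SketchRecord, SketchCommitRecord, SketchCreateTableRecord) and the function sketchDeserialize are hypothetical, not the classes in wal_record.h, and the single payload argument stands in for whatever a real deserializer would read from the stream.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>

enum class SketchRecordType : uint8_t { INVALID = 0, COMMIT = 1, CREATE_TABLE = 2 };

struct SketchRecord {
    SketchRecordType type = SketchRecordType::INVALID;
    SketchRecord() = default;
    explicit SketchRecord(SketchRecordType type) : type{type} {}
    virtual ~SketchRecord() = default;
};

struct SketchCommitRecord final : SketchRecord {
    uint64_t transactionID = 0;
    // Default constructor used by the deserializer: leaves type as INVALID.
    SketchCommitRecord() = default;
    explicit SketchCommitRecord(uint64_t id)
        : SketchRecord{SketchRecordType::COMMIT}, transactionID{id} {}
};

struct SketchCreateTableRecord final : SketchRecord {
    uint64_t tableID = 0;
    SketchCreateTableRecord() = default;
    explicit SketchCreateTableRecord(uint64_t id)
        : SketchRecord{SketchRecordType::CREATE_TABLE}, tableID{id} {}
};

// Builds a record from an already-read type tag and a single payload field.
std::unique_ptr<SketchRecord> sketchDeserialize(SketchRecordType type, uint64_t payload) {
    std::unique_ptr<SketchRecord> record;
    switch (type) {
    case SketchRecordType::COMMIT: {
        auto commit = std::make_unique<SketchCommitRecord>();
        commit->transactionID = payload;
        record = std::move(commit);
    } break;
    case SketchRecordType::CREATE_TABLE: {
        auto create = std::make_unique<SketchCreateTableRecord>();
        create->tableID = payload;
        record = std::move(create);
    } break;
    default:
        throw std::runtime_error("unknown record type");
    }
    // Needed because the default constructors above never set the type.
    record->type = type;
    return record;
}
```

Removing the default constructors, as proposed in the reply, would make each case construct the record with its type already set and turn the trailing assignment into dead code.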

-"Unrecognized WAL record type inside WALReplayer::replay. recordType: " +
-walRecordTypeToString(walRecord.recordType));
+throw RuntimeException("Unrecognized WAL record type inside WALReplayer::replay. type: " +
+walRecordTypeToString(walRecord.type));

Collaborator

Won't walRecordTypeToString always produce KU_UNREACHABLE here? I think it would only be used if one of the known types is missing here (for that matter, what's CREATE_REL_TABLE_GROUP_RECORD for? It doesn't seem to be used).

Contributor Author

Yeah, nice catch. I cleaned up these toString functions; we should just put KU_UNREACHABLE here.

Removed CREATE_REL_TABLE_GROUP_RECORD, which is not used for now.
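
The reviewer's point can be sketched as follows. The enum SketchWalType, the helper sketchUnreachable (a stand-in for KU_UNREACHABLE that simply aborts) and sketchWalTypeToString are hypothetical names, not the project's definitions.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

enum class SketchWalType : uint8_t { COMMIT = 1, CREATE_TABLE = 2 };

// Hypothetical stand-in for KU_UNREACHABLE: terminates the process.
[[noreturn]] inline void sketchUnreachable() {
    std::abort();
}

// Covers every known enumerator; any other value is a programming error.
std::string sketchWalTypeToString(SketchWalType type) {
    switch (type) {
    case SketchWalType::COMMIT:
        return "COMMIT";
    case SketchWalType::CREATE_TABLE:
        return "CREATE_TABLE";
    }
    sketchUnreachable();
}
```

If a replay function switches over the same enumerators, its fallback branch is reached only for values that this function cannot name either. Building an error message there with the to-string helper would hit the unreachable marker before the exception is thrown, so calling the marker directly is equivalent and simpler.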

@@ -135,5 +135,5 @@ TEST_F(NodeInsertionDeletionTests, TruncatedWalTest) {
walFileInfo->truncate(BufferPoolConstants::PAGE_4KB_SIZE);
}
// Re-open database
-database = std::make_unique<Database>(databasePath);
+EXPECT_THROW(database = std::make_unique<Database>(databasePath), RuntimeException);

Collaborator

I'm not sure it's helpful to include this test if we expect it to fail now.

Contributor Author

I will keep it for now as a test for a malformed WAL file. We should add more tests for malformed WAL files and rework this one later.

@ray6080 ray6080 merged commit 27a7b0b into master Apr 9, 2024
18 checks passed
@ray6080 ray6080 deleted the shadow-wal-separation branch April 9, 2024 06:39