Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If the wal file is truncated, don't attempt to read off the end #2576

Merged
merged 1 commit into from
Dec 19, 2023

Conversation

benjaminwinger
Copy link
Collaborator

@benjaminwinger benjaminwinger commented Dec 12, 2023

Fixes #2391 (hopefully). As far as I can tell, the database is fine, it's just that the wal file is truncated and we crash when determining whether or not a checkpoint was completed before the interruption since there's a dangling reference to the next page containing the header records.

I was able to start up a database which it was previously failing on successfully, and I've been unable to reproduce it with the changes.

@Riolku, could you double-check this fixes the issue?

TODO: this could be tested by finding an example which writes multiple pages to the wal file after a commit where we skip the checkpoint, manually truncate the wal file to just one page, and then attempt to reopen the database.

Copy link

codecov bot commented Dec 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (60ebb05) 93.16% compared to head (d76780c) 93.15%.
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2576      +/-   ##
==========================================
- Coverage   93.16%   93.15%   -0.01%     
==========================================
  Files        1027     1027              
  Lines       38418    38416       -2     
==========================================
- Hits        35793    35788       -5     
- Misses       2625     2628       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@benjaminwinger
Copy link
Collaborator Author

Test added, and I added a fix for another issue, where the commit being written before the shadow pages are flushed can lead to a failed recovery (which I noticed when accidentally simulating it by truncating the wal file when it contains the commit within the first page). I think it can only occur if the interruption happens between when the headers are flushed and when the shadow pages are flushed, since the commit shouldn't be flushed until the next time WAL::flushAllPages is called, so maybe a better solution than flushing before the commit would be to invert the ordering of the flushes to make sure the header containing the commit is flushed last, but I'm not sure if there's another situation where the commit may be flushed first.

I don't think it's feasible to test though. Adding a test which kills the database process after a certain amount of time and attempts to recover might trigger it occasionally, but it won't be reliable.

@ray6080
Copy link
Contributor

ray6080 commented Dec 13, 2023

This is one way to fix this, but I think the best way might be to change the logic of logging wal records, we should always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

@benjaminwinger
Copy link
Collaborator Author

always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

Couldn't that still end up with corrupt data if the interruption occurs during the flush? It should make it much less likely, compared to writing some data before a long copy transaction and the rest afterwards, but I'm not sure it solves the issue by itself, as I think only individual page writes are atomic.

@ray6080
Copy link
Contributor

ray6080 commented Dec 14, 2023

always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

Couldn't that still end up with corrupt data if the interruption occurs during the flush? It should make it much less likely, compared to writing some data before a long copy transaction and the rest afterwards, but I'm not sure it solves the issue by itself, as I think only individual page writes are atomic.

That's true, we need to also make sure the final flush is atomic in this way.

@Riolku
Copy link
Contributor

Riolku commented Dec 14, 2023

Seems to work, thanks!

@benjaminwinger benjaminwinger merged commit 28bd991 into master Dec 19, 2023
14 checks passed
@benjaminwinger benjaminwinger deleted the wal-header-fix branch December 19, 2023 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Killing Database during COPY causes Corruption
3 participants