If the wal file is truncated, don't attempt to read off the end #2576

benjaminwinger · 2023-12-12T22:53:11Z

Fixes #2391 (hopefully). As far as I can tell, the database is fine, it's just that the wal file is truncated and we crash when determining whether or not a checkpoint was completed before the interruption since there's a dangling reference to the next page containing the header records.

I was able to start up a database which it was previously failing on successfully, and I've been unable to reproduce it with the changes.

@Riolku, could you double-check this fixes the issue?

TODO: this could be tested by finding an example which writes multiple pages to the wal file after a commit where we skip the checkpoint, manually truncate the wal file to just one page, and then attempt to reopen the database.

codecov · 2023-12-12T23:02:21Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (60ebb05) 93.16% compared to head (d76780c) 93.15%.
Report is 2 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2576      +/-   ##
==========================================
- Coverage   93.16%   93.15%   -0.01%     
==========================================
  Files        1027     1027              
  Lines       38418    38416       -2     
==========================================
- Hits        35793    35788       -5     
- Misses       2625     2628       +3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

benjaminwinger · 2023-12-13T19:00:05Z

Test added, and I added a fix for another issue, where the commit being written before the shadow pages are flushed can lead to a failed recovery (which I noticed when accidentally simulating it by truncating the wal file when it contains the commit within the first page). I think it can only occur if the interruption happens between when the headers are flushed and when the shadow pages are flushed, since the commit shouldn't be flushed until the next time WAL::flushAllPages is called, so maybe a better solution than flushing before the commit would be to invert the ordering of the flushes to make sure the header containing the commit is flushed last, but I'm not sure if there's another situation where the commit may be flushed first.

I don't think it's feasible to test though. Adding a test which kills the database process after a certain amount of time and attempts to recover might trigger it occasionally, but it won't be reliable.

ray6080 · 2023-12-13T22:55:53Z

This is one way to fix this, but I think the best way might be to change the logic of logging wal records, we should always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

benjaminwinger · 2023-12-14T15:28:10Z

always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

Couldn't that still end up with corrupt data if the interruption occurs during the flush? It should make it much less likely, compared to writing some data before a long copy transaction and the rest afterwards, but I'm not sure it solves the issue by itself, as I think only individual page writes are atomic.

ray6080 · 2023-12-14T20:07:37Z

always cache those records in memory, not write them out to wal file before we call flushAllPages. This makes the buffer inside wal grow larger, but it simplifies the logic. We don't flush wal records or shadow pages and do fsync until TransactionManager is committing the transaction.

Couldn't that still end up with corrupt data if the interruption occurs during the flush? It should make it much less likely, compared to writing some data before a long copy transaction and the rest afterwards, but I'm not sure it solves the issue by itself, as I think only individual page writes are atomic.

That's true, we need to also make sure the final flush is atomic in this way.

Riolku · 2023-12-14T20:36:40Z

Seems to work, thanks!

If the wal file is truncated, don't attempt to read off the end

d76780c

benjaminwinger force-pushed the wal-header-fix branch from 49ba271 to d76780c Compare December 13, 2023 18:47

ray6080 approved these changes Dec 14, 2023

View reviewed changes

benjaminwinger merged commit 28bd991 into master Dec 19, 2023
14 checks passed

benjaminwinger deleted the wal-header-fix branch December 19, 2023 17:04

benjaminwinger mentioned this pull request Dec 19, 2023

Update TruncatedWalTest to use the new filesystem API #2597

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

If the wal file is truncated, don't attempt to read off the end #2576

If the wal file is truncated, don't attempt to read off the end #2576

benjaminwinger commented Dec 12, 2023 •

edited

Loading

codecov bot commented Dec 12, 2023 •

edited

Loading

benjaminwinger commented Dec 13, 2023

ray6080 commented Dec 13, 2023 •

edited

Loading

benjaminwinger commented Dec 14, 2023

ray6080 commented Dec 14, 2023

Riolku commented Dec 14, 2023

If the wal file is truncated, don't attempt to read off the end #2576

If the wal file is truncated, don't attempt to read off the end #2576

Conversation

benjaminwinger commented Dec 12, 2023 • edited Loading

codecov bot commented Dec 12, 2023 • edited Loading

Codecov Report

benjaminwinger commented Dec 13, 2023

ray6080 commented Dec 13, 2023 • edited Loading

benjaminwinger commented Dec 14, 2023

ray6080 commented Dec 14, 2023

Riolku commented Dec 14, 2023

benjaminwinger commented Dec 12, 2023 •

edited

Loading

codecov bot commented Dec 12, 2023 •

edited

Loading

ray6080 commented Dec 13, 2023 •

edited

Loading