-
Notifications
You must be signed in to change notification settings - Fork 434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad request: invalid request with request LSN 0/B0AFDD0 and not_modified_since 0/B0AFE80 #7791
Comments
In
First observation: In a debug build, we should hit that assertion instead of hitting essentially the same check in the pageserver. I think you could hit that case in the follow scenario:
In step 4, it would see last_written_lsn as already updated for the WAL record's LSN. But because the WAL record hasn't been fully replayed yet, GetXLogReplayRecPtr() returns the LSN before that WAL record. We have this later in
That seems to addressing the same issue, but the damage has already been done, because we already set request_lsns.request_lsn to too low a value. We should create a test case for this. Might be tricky.. |
Ok, I was able to write a python test to reproduce this. |
I don't have a fix yet, and don't fully understand the problem yet, but looking at the code, I can two things which look a bit suspicious:
This is the function that checks if a page needs to be read into memory in WAL redo to update the in-cache copy of the page, or if we can just skip the page if it's not in cache. It's necessary to update the last-written LSN when the page isn't cached, but it's pretty surprising that we update it also when it is. If the page gets evicted from the page cache between this and the ReadBuffer() call to actually read it into the page cache, will the startup process request the already-updated version of the page from the pageserver? Is that OK? The redo functions can generally handle seeing a page version that's newer than the record we're replaying, because that's normal during crash recovery. But might something get confused if that happens during hot standby? Also, what if a backend reads the same page, with the already updated last-written LSN, before the startup process does? Is that OK? Also, because it's just a cache, after we update the last-written cache for one page, we might see that newer value for reads on different pages too. Unlikely in practice, but the cache might overflow at the crucial moment so that the new last-written value makes its way to the per-relation or global last-written LSN value, and is used for a subsequent read of a different page. Maybe that's OK? I'm not sure. If a WAL record modifies two pages, is it possible that a backend would read the old version of one of the pages, and new version of another? That's usually handled by the redo functions acquiring and holding locks on pages in the right order, but might this allow you to see the new version of one page before the redo function has acquired the lock on it? Could we update the last-written LSN value only when we skip updating the page? Any disadvantage? In short, this feels wrong, but I cannot put a finger on a exact scenario where it would lead to problems.
|
Originally this code was written by @MMeent and looks slightly different: it set lowland to Looking once again at the code and reading your comments, I have to agree that setting lwlsn to Should we ever set lwlsn when page is absent? Usually we update lwlsn when page is evicted in
Please notice that It seems to me that we should remove this check and remove assignment of lwlsn for present pages. |
I wrote my comment above before reading 2. And come to the same conclusion. |
Don't set last-written LSN of a page when the record is replayed, only when the page is evicted from cache. For comparison, we don't update the last-written LSN on every page modification on the primary either, only when the page is evicted. Do update the last-written LSN when the page update is skipped in WAL redo, however. In neon_get_request_lsns(), don't be surprised if the last-written LSN is equal to the record being replayed. Use the LSN of the record being replayed as the request LSN in that case. Add a long comment explaining how that can happen. In neon_wallog_page, update last-written LSN also when Shutdown has been requested. We might still fetch and evict pages for a while, after shutdown has been requested, so we better continue to do that correctly. Enable the check that we don't evict a page with zero LSN also in standby, but make it a LOG message instead of PANIC Fixes issue #7791
Don't set last-written LSN of a page when the record is replayed, only when the page is evicted from cache. For comparison, we don't update the last-written LSN on every page modification on the primary either, only when the page is evicted. Do update the last-written LSN when the page update is skipped in WAL redo, however. In neon_get_request_lsns(), don't be surprised if the last-written LSN is equal to the record being replayed. Use the LSN of the record being replayed as the request LSN in that case. Add a long comment explaining how that can happen. In neon_wallog_page, update last-written LSN also when Shutdown has been requested. We might still fetch and evict pages for a while, after shutdown has been requested, so we better continue to do that correctly. Enable the check that we don't evict a page with zero LSN also in standby, but make it a LOG message instead of PANIC Fixes issue #7791
Don't set last-written LSN of a page when the record is replayed, only when the page is evicted from cache. For comparison, we don't update the last-written LSN on every page modification on the primary either, only when the page is evicted. Do update the last-written LSN when the page update is skipped in WAL redo, however. In neon_get_request_lsns(), don't be surprised if the last-written LSN is equal to the record being replayed. Use the LSN of the record being replayed as the request LSN in that case. Add a long comment explaining how that can happen. In neon_wallog_page, update last-written LSN also when Shutdown has been requested. We might still fetch and evict pages for a while, after shutdown has been requested, so we better continue to do that correctly. Enable the check that we don't evict a page with zero LSN also in standby, but make it a LOG message instead of PANIC Fixes issue #7791
This was fixed by #7795 |
Regression test failure seen in read replica:
@knizhnik can you update this description with any other details, please?
The text was updated successfully, but these errors were encountered: