On-demand SLRU download: invalid request with request LSN 0/5000000 and not_modified_since 0/5000028 #8030
hlinnaka
added a commit
that referenced
this issue
Jun 12, 2024
If a standby is started right after switching to a new WAL segment, the request in the SLRU download request would point to the beginning of the segment (e.g. 0/5000000), while the not-modified-since LSN would point to just after the page header (e.g. 0/5000028). It's effectively the same position, as there cannot be any WAL records in between, but the pageserver rightly errors out on any request where the request LSN < not-modified since LSN. To fix, round down the not-modified since LSN to the beginning of the page like the request LSN. Fixes issue #8030
hlinnaka
added a commit
that referenced
this issue
Jun 12, 2024
…ry (#8031) If a standby is started right after switching to a new WAL segment, the request in the SLRU download request would point to the beginning of the segment (e.g. 0/5000000), while the not-modified-since LSN would point to just after the page header (e.g. 0/5000028). It's effectively the same position, as there cannot be any WAL records in between, but the pageserver rightly errors out on any request where the request LSN < not-modified since LSN. To fix, round down the not-modified since LSN to the beginning of the page like the request LSN. Fixes issue #8030
save-buffer
pushed a commit
that referenced
this issue
Jun 17, 2024
…ry (#8031) If a standby is started right after switching to a new WAL segment, the request in the SLRU download request would point to the beginning of the segment (e.g. 0/5000000), while the not-modified-since LSN would point to just after the page header (e.g. 0/5000028). It's effectively the same position, as there cannot be any WAL records in between, but the pageserver rightly errors out on any request where the request LSN < not-modified since LSN. To fix, round down the not-modified since LSN to the beginning of the page like the request LSN. Fixes issue #8030
Steps to reproduce
Run the test_import_at_2bil test with lazy_slru_download enabled:
Expected result
Test runs successfully
Actual result
Test fails:
Pageserver log:
Logs & initial investigation
Compute log:
(The WARNING is expected with this test)
WAL dump:
backup_manifest:
The request LSN is 0/5000000 and not_modified_since is 0/5000028. That's bogus, so you get an error; the request LSN should always be >= not_modified_since. However, those two values are effectively the same: 0/5000000 points to the beginning of the WAL page, while 0/5000028 points to just after the page header.
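To make the arithmetic concrete, here is a minimal sketch that parses the two LSNs from the error message and applies the "request LSN >= not_modified_since" rule. The helper names parse_lsn and is_valid_request are hypothetical and are not taken from the pageserver code; the only assumption is the usual Postgres hi/lo hexadecimal LSN notation.

```python
def parse_lsn(text: str) -> int:
    """Convert a Postgres-style 'hi/lo' hexadecimal LSN into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def is_valid_request(request_lsn: int, not_modified_since: int) -> bool:
    """Hypothetical stand-in for the sanity check described in this issue."""
    return request_lsn >= not_modified_since


request_lsn = parse_lsn("0/5000000")
not_modified_since = parse_lsn("0/5000028")

# The two positions are 0x28 bytes apart: the space taken by the page header.
gap = not_modified_since - request_lsn

# The request is rejected because request_lsn < not_modified_since.
rejected = not is_valid_request(request_lsn, not_modified_since)
```

Under these assumptions the gap is 0x28 bytes and the check rejects the request, even though no WAL record can exist between the two positions.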
When called at standby startup, before it has replayed any records, the neon_read_slru_segment function calls GetRedoStartLsn() to get the not-modified-since LSN, and nm_adjust_lsn(GetRedoStartLsn()) to get the request LSN.
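The mismatch between the two calls, and the effect of rounding both values down as the referenced commit describes, can be sketched as follows. This is an illustration only, not the actual nm_adjust_lsn implementation: adjust_lsn is a hypothetical helper, and the page size, segment size, and header sizes are assumed Postgres defaults.

```python
XLOG_BLCKSZ = 8192                   # assumed default WAL page size
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size
SHORT_PAGE_HEADER = 24               # assumed size of a regular page header
LONG_PAGE_HEADER = 40                # assumed size of a segment-start header (0x28)


def adjust_lsn(lsn: int) -> int:
    """If lsn points just past a page header, move it back to the page start."""
    if lsn % XLOG_BLCKSZ == SHORT_PAGE_HEADER:
        return lsn - SHORT_PAGE_HEADER
    if lsn % WAL_SEGMENT_SIZE == LONG_PAGE_HEADER:
        return lsn - LONG_PAGE_HEADER
    return lsn


redo_start_lsn = 0x5000028  # 0/5000028, just after the first page header

# Before the fix: only the request LSN is rounded down.
request_lsn = adjust_lsn(redo_start_lsn)
not_modified_since_before = redo_start_lsn

# After the fix: the not-modified-since LSN is rounded down the same way.
not_modified_since_after = adjust_lsn(redo_start_lsn)
```

With both values passed through the same adjustment, they land on the same position (0/5000000), so the request LSN can no longer fall below the not-modified-since LSN in this startup case.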