Skip to content

Commit

Permalink
Set page LSN when reconstructing VM in page server (#7935)
Browse files Browse the repository at this point in the history
## Problem

Page LSN is not set while VM update.
May be reason of test_vm_bits flukyness.
Buit more serious issues can be also caused by wrong LSN.

Related: #7935

## Summary of changes

- In `apply_in_neon`, set the LSN bytes when applying records of type
`ClearVisibilityMapFlags`
  • Loading branch information
knizhnik committed Jun 4, 2024
1 parent 00032c9 commit 387a368
Show file tree
Hide file tree
Showing 3 changed files with 9 additions and 5 deletions.
4 changes: 2 additions & 2 deletions pageserver/src/tenant.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3992,8 +3992,8 @@ pub(crate) mod harness {
let base_img = base_img.expect("Neon WAL redo requires base image").1;
let mut page = BytesMut::new();
page.extend_from_slice(&base_img);
for (_record_lsn, record) in records {
apply_neon::apply_in_neon(&record, key, &mut page)?;
for (record_lsn, record) in records {
apply_neon::apply_in_neon(&record, record_lsn, key, &mut page)?;
}
Ok(page.freeze())
} else {
Expand Down
4 changes: 2 additions & 2 deletions pageserver/src/walredo.rs
Original file line number Diff line number Diff line change
Expand Up @@ -361,10 +361,10 @@ impl PostgresRedoManager {
&self,
key: Key,
page: &mut BytesMut,
_record_lsn: Lsn,
record_lsn: Lsn,
record: &NeonWalRecord,
) -> anyhow::Result<()> {
apply_neon::apply_in_neon(record, key, page)?;
apply_neon::apply_in_neon(record, record_lsn, key, page)?;

Ok(())
}
Expand Down
6 changes: 5 additions & 1 deletion pageserver/src/walredo/apply_neon.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ use postgres_ffi::v14::nonrelfile_utils::{
use postgres_ffi::BLCKSZ;
use tracing::*;
use utils::bin_ser::BeSer;
use utils::lsn::Lsn;

/// Can this request be served by neon redo functions
/// or we need to pass it to wal-redo postgres process?
Expand All @@ -32,6 +33,7 @@ pub(crate) fn can_apply_in_neon(rec: &NeonWalRecord) -> bool {

pub(crate) fn apply_in_neon(
record: &NeonWalRecord,
lsn: Lsn,
key: Key,
page: &mut BytesMut,
) -> Result<(), anyhow::Error> {
Expand Down Expand Up @@ -67,6 +69,7 @@ pub(crate) fn apply_in_neon(
let map = &mut page[pg_constants::MAXALIGN_SIZE_OF_PAGE_HEADER_DATA..];

map[map_byte as usize] &= !(flags << map_offset);
postgres_ffi::page_set_lsn(page, lsn);
}

// Repeat for 'old_heap_blkno', if any
Expand All @@ -80,6 +83,7 @@ pub(crate) fn apply_in_neon(
let map = &mut page[pg_constants::MAXALIGN_SIZE_OF_PAGE_HEADER_DATA..];

map[map_byte as usize] &= !(flags << map_offset);
postgres_ffi::page_set_lsn(page, lsn);
}
}
// Non-relational WAL records are handled here, with custom code that has the
Expand Down Expand Up @@ -285,7 +289,7 @@ mod test {
let mut page = BytesMut::from_iter(base_image);

for record in deltas {
apply_in_neon(&record, file_path, &mut page)?;
apply_in_neon(&record, Lsn(8), file_path, &mut page)?;
}

let reconstructed = AuxFilesDirectory::des(&page)?;
Expand Down

1 comment on commit 387a368

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3262 tests run: 3109 passed, 1 failed, 152 skipped (full report)


Failures on Postgres 16

  • test_sharding_backpressure: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_sharding_backpressure[release-pg16]"
Flaky tests (3)

Postgres 16

Postgres 15

  • test_statvfs_pressure_usage: debug

Postgres 14

  • test_pageserver_restarts_under_worload: release

Test coverage report is not available

The comment gets automatically updated with the latest test results
387a368 at 2024-06-04T10:20:59.517Z :recycle:

Please sign in to comment.