Skip to content

Commit

Permalink
pageserver: fix AUX key vectored get validation (#7018)
Browse files Browse the repository at this point in the history
## Problem
The value reconstruct of AUX_FILES_KEY from records is not deterministic
since it uses a hash map under the hood. This caused vectored get validation
failures when enabled in staging.

## Summary of changes
Deserialise AUX_FILES_KEY blobs comparing. All other keys should
reconstruct deterministically, so we simply compare the blobs.
  • Loading branch information
VladLazar committed Mar 5, 2024
1 parent f3e4f85 commit ae8468f
Show file tree
Hide file tree
Showing 2 changed files with 40 additions and 3 deletions.
2 changes: 1 addition & 1 deletion pageserver/src/pgdatadir_mapping.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1677,7 +1677,7 @@ struct RelDirectory {
rels: HashSet<(Oid, u8)>,
}

#[derive(Debug, Serialize, Deserialize, Default)]
#[derive(Debug, Serialize, Deserialize, Default, PartialEq)]
pub(crate) struct AuxFilesDirectory {
pub(crate) files: HashMap<String, Bytes>,
}
Expand Down
41 changes: 39 additions & 2 deletions pageserver/src/tenant/timeline.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ use futures::stream::StreamExt;
use itertools::Itertools;
use once_cell::sync::Lazy;
use pageserver_api::{
key::AUX_FILES_KEY,
keyspace::KeySpaceAccum,
models::{
CompactionAlgorithm, DownloadRemoteLayersTaskInfo, DownloadRemoteLayersTaskSpawnRequest,
Expand Down Expand Up @@ -891,8 +892,7 @@ impl Timeline {
assert_eq!(seq_key, vec_key);
match (seq_res, vec_res) {
(Ok(seq_blob), Ok(vec_blob)) => {
assert_eq!(seq_blob, vec_blob,
"Image mismatch for key {seq_key} - keyspace={keyspace:?} lsn={lsn}");
Self::validate_key_equivalence(seq_key, &keyspace, lsn, seq_blob, vec_blob);
},
(Err(err), Ok(_)) => {
panic!(
Expand All @@ -911,6 +911,43 @@ impl Timeline {
}
}

fn validate_key_equivalence(
key: &Key,
keyspace: &KeySpace,
lsn: Lsn,
seq: &Bytes,
vec: &Bytes,
) {
use utils::bin_ser::BeSer;

if *key == AUX_FILES_KEY {
// The value reconstruct of AUX_FILES_KEY from records is not deterministic
// since it uses a hash map under the hood. Hence, deserialise both results
// before comparing.
let seq_aux_dir_res = AuxFilesDirectory::des(seq);
let vec_aux_dir_res = AuxFilesDirectory::des(vec);
match (&seq_aux_dir_res, &vec_aux_dir_res) {
(Ok(seq_aux_dir), Ok(vec_aux_dir)) => {
assert_eq!(
seq_aux_dir, vec_aux_dir,
"Mismatch for key {} - keyspace={:?} lsn={}",
key, keyspace, lsn
);
}
(Err(_), Err(_)) => {}
_ => {
panic!("Mismatch for {key}: {seq_aux_dir_res:?} != {vec_aux_dir_res:?}");
}
}
} else {
// All other keys should reconstruct deterministically, so we simply compare the blobs.
assert_eq!(
seq, vec,
"Image mismatch for key {key} - keyspace={keyspace:?} lsn={lsn}"
);
}
}

/// Get last or prev record separately. Same as get_last_record_rlsn().last/prev.
pub(crate) fn get_last_record_lsn(&self) -> Lsn {
self.last_record_lsn.load().last
Expand Down

1 comment on commit ae8468f

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2561 tests run: 2425 passed, 1 failed, 135 skipped (full report)


Failures on Postgres 14

  • test_basebackup_with_high_slru_count[github-actions-selfhosted-sequential-10-13-30]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_basebackup_with_high_slru_count[release-pg14-github-actions-selfhosted-sequential-10-13-30]"
Flaky tests (2)

Postgres 16

  • test_compute_pageserver_connection_stress: release
  • test_vm_bit_clear_on_heap_lock: release

Code coverage* (full report)

  • functions: 28.8% (6962 of 24204 functions)
  • lines: 47.2% (42632 of 90238 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
ae8468f at 2024-03-05T14:45:22.650Z :recycle:

Please sign in to comment.