Skip to content

Commit

Permalink
pageserver: fix a warning on secondary mode downloads after evictions (
Browse files Browse the repository at this point in the history
…#7877)

## Problem

In 4ce6e2d we added a warning when progress stats don't look right at
the end of a secondary download pass.

This `Correcting drift in progress stats` warning fired in staging on a
pageserver that had been doing some disk usage eviction.

The impact is low because in the same place we log the warning, we also
fix up the progress values.

## Summary of changes

- When we skip downloading a layer because it was recently evicted,
update the progress stats to ensure they still reach a clean complete
state at the end of a download pass.
- Also add a log for evicting secondary location layers, for symmetry
with attached locations, so that we can clearly see when eviction has
happened for a particular tenant's layers when investigating issues.

This is a point fix -- the code would also benefit from being refactored
so that there is some "download result" type with a Skip variant, to
ensure that we are updating the progress stats uniformly for those
cases.
  • Loading branch information
jcsp authored May 28, 2024
1 parent f9f69a2 commit 352b08d
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 7 deletions.
1 change: 1 addition & 0 deletions pageserver/src/tenant/secondary.rs
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,7 @@ impl SecondaryTenant {
};

let now = SystemTime::now();
tracing::info!("Evicting secondary layer");

let this = self.clone();

Expand Down
18 changes: 11 additions & 7 deletions pageserver/src/tenant/secondary/downloader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -909,6 +909,7 @@ impl<'a> TenantDownloader<'a> {
strftime(&layer.access_time),
strftime(evicted_at)
);
self.skip_layer(layer);
continue;
}
}
Expand Down Expand Up @@ -963,6 +964,15 @@ impl<'a> TenantDownloader<'a> {
Ok(())
}

/// Call this during timeline download if a layer will _not_ be downloaded, to update progress statistics
fn skip_layer(&self, layer: HeatMapLayer) {
let mut progress = self.secondary_state.progress.lock().unwrap();
progress.layers_total = progress.layers_total.saturating_sub(1);
progress.bytes_total = progress
.bytes_total
.saturating_sub(layer.metadata.file_size);
}

async fn download_layer(
&self,
tenant_shard_id: &TenantShardId,
Expand Down Expand Up @@ -1012,13 +1022,7 @@ impl<'a> TenantDownloader<'a> {
"Skipped downloading missing layer {}, raced with compaction/gc?",
layer.name
);

// If the layer is 404, adjust the progress statistics to reflect that we will not download it.
let mut progress = self.secondary_state.progress.lock().unwrap();
progress.layers_total = progress.layers_total.saturating_sub(1);
progress.bytes_total = progress
.bytes_total
.saturating_sub(layer.metadata.file_size);
self.skip_layer(layer);

return Ok(None);
}
Expand Down

1 comment on commit 352b08d

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3232 tests run: 3086 passed, 0 failed, 146 skipped (full report)


Flaky tests (1)

Postgres 16

  • test_vm_bit_clear_on_heap_lock: debug

Code coverage* (full report)

  • functions: 31.4% (6448 of 20544 functions)
  • lines: 48.3% (49954 of 103359 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
352b08d at 2024-05-28T16:26:23.063Z :recycle:

Please sign in to comment.