pageserver: add more info-level logging in shard splits (#8137)
## Problem

`test_sharding_autosplit` occasionally fails with warnings about shard splits taking longer than expected (`Exclusive lock by ShardSplit was held for`...).

It's not obvious which part is taking the time (I suspect remote storage
uploads).

Example:
https://neon-github-public-dev.s3.amazonaws.com/reports/main/9618788427/index.html#testresult/b395294d5bdeb783/

## Summary of changes

- Since shard splits are infrequent events, we can afford to be very chatty: add a bunch of info-level logging throughout the process (the logging pattern is sketched below).
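
For context, the added messages use the `tracing` crate's structured-field style visible in the diff below, e.g. `tracing::info!(timeline_id=%timeline.timeline_id, "...")`. Here is a minimal, self-contained sketch of that pattern; the function name, span, and id values are illustrative and not taken from the commit.

```rust
// Minimal sketch of the structured logging pattern added by this commit.
// Assumes the `tracing` and `tracing-subscriber` crates; `log_split_step`
// and the example ids are illustrative only.
use tracing::{info, info_span};

fn log_split_step(timeline_id: &str, step: &str) {
    // `%` records the field via its Display impl, mirroring
    // `tracing::info!(timeline_id=%timeline.timeline_id, "...")` in the diff.
    info!(timeline_id = %timeline_id, "{}", step);
}

fn main() {
    // Print info-level events to stdout so the new messages are visible.
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::INFO)
        .init();

    // Group the events under a shared span, roughly as the pageserver does per tenant shard.
    let _span = info_span!("shard_split", tenant_shard_id = "example-shard").entered();
    log_split_step("example-timeline", "Uploading index");
    log_split_step("example-timeline", "Shutting down remote storage client");
}
```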
jcsp committed Jun 24, 2024
1 parent 188797f commit de05f90
Showing 2 changed files with 16 additions and 0 deletions.
4 changes: 4 additions & 0 deletions pageserver/src/tenant.rs
@@ -2151,19 +2151,22 @@ impl Tenant {
// Upload an index from the parent: this is partly to provide freshness for the
// child tenants that will copy it, and partly for general ease-of-debugging: there will
// always be a parent shard index in the same generation as we wrote the child shard index.
tracing::info!(timeline_id=%timeline.timeline_id, "Uploading index");
timeline
.remote_client
.schedule_index_upload_for_file_changes()?;
timeline.remote_client.wait_completion().await?;

// Shut down the timeline's remote client: this means that the indices we write
// for child shards will not be invalidated by the parent shard deleting layers.
tracing::info!(timeline_id=%timeline.timeline_id, "Shutting down remote storage client");
timeline.remote_client.shutdown().await;

// Download methods can still be used after shutdown, as they don't flow through the remote client's
// queue. In principle the RemoteTimelineClient could provide this without downloading it, but this
// operation is rare, so it's simpler to just download it (and robustly guarantees that the index
// we use here really is the remotely persistent one).
tracing::info!(timeline_id=%timeline.timeline_id, "Downloading index_part from parent");
let result = timeline.remote_client
.download_index_file(&self.cancel)
.instrument(info_span!("download_index_file", tenant_id=%self.tenant_shard_id.tenant_id, shard_id=%self.tenant_shard_id.shard_slug(), timeline_id=%timeline.timeline_id))
@@ -2176,6 +2179,7 @@
};

for child_shard in child_shards {
tracing::info!(timeline_id=%timeline.timeline_id, "Uploading index_part for child {}", child_shard.to_index());
upload_index_part(
&self.remote_storage,
child_shard,
12 changes: 12 additions & 0 deletions pageserver/src/tenant/mgr.rs
@@ -1715,6 +1715,7 @@ impl TenantManager {
let timelines = parent_shard.timelines.lock().unwrap().clone();
let parent_timelines = timelines.keys().cloned().collect::<Vec<_>>();
for timeline in timelines.values() {
tracing::info!(timeline_id=%timeline.timeline_id, "Loading list of layers to hardlink");
let timeline_layers = timeline
.layers
.read()
@@ -1754,7 +1755,12 @@

// Since we will do a large number of small filesystem metadata operations, batch them into
// spawn_blocking calls rather than doing each one as a tokio::fs round-trip.
let span = tracing::Span::current();
let jh = tokio::task::spawn_blocking(move || -> anyhow::Result<usize> {
// Run this synchronous code in the same log context as the outer function that spawned it.
let _span = span.enter();

tracing::info!("Creating {} directories", create_dirs.len());
for dir in &create_dirs {
if let Err(e) = std::fs::create_dir_all(dir) {
// Ignore AlreadyExists errors, drop out on all other errors
@@ -1768,6 +1774,11 @@
}

for child_prefix in child_prefixes {
tracing::info!(
"Hard-linking {} parent layers into child path {}",
parent_layers.len(),
child_prefix
);
for relative_layer in &parent_layers {
let parent_path = parent_path.join(relative_layer);
let child_path = child_prefix.join(relative_layer);
@@ -1793,6 +1804,7 @@
// Durability is not required for correctness, but if we crashed during split and
// then restarted with empty timeline dirs, it would be very inefficient to
// re-populate from remote storage.
tracing::info!("fsyncing {} directories", create_dirs.len());
for dir in create_dirs {
if let Err(e) = crashsafe::fsync(&dir) {
// Something removed a newly created timeline dir out from underneath us? Extremely
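
For reference, the `mgr.rs` hunk above captures the current span and re-enters it inside the `spawn_blocking` closure so the new info lines from the blocking filesystem work keep the caller's log context. Below is a minimal standalone sketch of that pattern, assuming the `tokio` (with the `rt-multi-thread` and `macros` features), `tracing`, `tracing-subscriber`, and `anyhow` crates; the function names, span fields, and messages are illustrative, not from the commit.

```rust
// Sketch of propagating the current tracing span into a spawn_blocking task,
// mirroring `let span = tracing::Span::current(); ... let _span = span.enter();`
// from the diff above. Names, fields, and messages are illustrative.
use tracing::{info, info_span, Instrument, Span};

async fn hardlink_layers() -> anyhow::Result<usize> {
    // Capture the caller's span on the async side...
    let span = Span::current();
    let jh = tokio::task::spawn_blocking(move || -> anyhow::Result<usize> {
        // ...and re-enter it inside the blocking closure, so these events
        // inherit the outer span's fields.
        let _entered = span.enter();
        info!("Creating 4 directories");
        Ok(4)
    });
    // `??` surfaces both a JoinError and the closure's own error.
    Ok(jh.await??)
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt().init();

    let created = hardlink_layers()
        .instrument(info_span!("tenant_shard_split", tenant_shard_id = "example-shard"))
        .await?;
    info!(created, "blocking work finished");
    Ok(())
}
```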

1 comment on commit de05f90

@github-actions

3004 tests run: 2877 passed, 1 failed, 126 skipped (full report)


Failures on Postgres 14

  • test_storage_controller_many_tenants[github-actions-selfhosted]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_storage_controller_many_tenants[release-pg14-github-actions-selfhosted]"
Flaky tests (1)

Postgres 14

Code coverage* (full report)

  • functions: 32.5% (6871 of 21131 functions)
  • lines: 50.0% (53396 of 106771 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
de05f90 at 2024-06-24T12:27:29.893Z :recycle:
