pageserver: fix division by zero in layer counting metric (#7662)
For aux file keys (v1 or v2) the vectored read path does not return an
error when they're missing. Instead they are omitted from the resulting
btree (this is a requirement, not a bug). Skip updating the metric in
these cases to avoid infinite results.
VladLazar committed May 8, 2024
1 parent b06eec4 commit d5399b7
Showing 1 changed file with 11 additions and 5 deletions.
16 changes: 11 additions & 5 deletions pageserver/src/tenant/timeline.rs
@@ -1220,11 +1220,17 @@ impl Timeline {
 }
 reconstruct_timer.stop_and_record();

-// Note that this is an approximation. Tracking the exact number of layers visited
-// per key requires virtually unbounded memory usage and is inefficient
-// (i.e. segment tree tracking each range queried from a layer)
-crate::metrics::VEC_READ_NUM_LAYERS_VISITED
-.observe(layers_visited as f64 / results.len() as f64);
+// For aux file keys (v1 or v2) the vectored read path does not return an error
+// when they're missing. Instead they are omitted from the resulting btree
+// (this is a requirement, not a bug). Skip updating the metric in these cases
+// to avoid infinite results.
+if !results.is_empty() {
+// Note that this is an approximation. Tracking the exact number of layers visited
+// per key requires virtually unbounded memory usage and is inefficient
+// (i.e. segment tree tracking each range queried from a layer)
+crate::metrics::VEC_READ_NUM_LAYERS_VISITED
+.observe(layers_visited as f64 / results.len() as f64);
+}

 Ok(results)
 }
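
For readers unfamiliar with why the guard matters: in Rust, dividing one `f64` by a zero `f64` does not panic. It yields infinity (or NaN when the numerator is also zero), and that value would then be recorded in the histogram. The following is a minimal standalone sketch of the same guard, not the pageserver code itself. The helper name `layers_visited_per_key` is hypothetical and exists only for illustration; the real change simply wraps the `observe` call in `if !results.is_empty()`.

```rust
/// Hypothetical helper (not part of the pageserver): computes the approximate
/// number of layers visited per returned key, or `None` when no keys were
/// returned and the ratio would be undefined.
fn layers_visited_per_key(layers_visited: usize, num_results: usize) -> Option<f64> {
    if num_results == 0 {
        // Nothing was returned (e.g. every requested aux file key was missing),
        // so there is no meaningful per-key value to observe.
        None
    } else {
        Some(layers_visited as f64 / num_results as f64)
    }
}

/// The unguarded computation, shown only to illustrate what the metric
/// would have received before the fix.
fn unguarded_ratio(layers_visited: usize, num_results: usize) -> f64 {
    layers_visited as f64 / num_results as f64
}
```

With this shape, a caller would update the metric only when the helper returns `Some(value)`, which mirrors the `is_empty` check in the diff above.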
1 comment on commit d5399b7

@github-actions
3105 tests run: 2957 passed, 2 failed, 146 skipped (full report)


Failures on Postgres 14

  • test_storage_controller_many_tenants[github-actions-selfhosted]: release
  • test_parallel_copy_different_tables[neon-github-actions-selfhosted]: release
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_storage_controller_many_tenants[release-pg14-github-actions-selfhosted] or test_parallel_copy_different_tables[neon-release-pg14-github-actions-selfhosted]"
Flaky tests (2)

Postgres 16

  • test_vm_bit_clear_on_heap_lock: debug

Postgres 14

  • test_gc_aggressive: debug

Code coverage* (full report)

  • functions: 31.4% (6313 of 20124 functions)
  • lines: 47.3% (47578 of 100662 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
d5399b7 at 2024-05-08T20:11:39.530Z