fix(Layer): carry gate until eviction is complete (#7838)
the gate was accidentally being dropped before the final blocking
phase, possibly explaining the resident physical size global problems
during deletions.

it could have caused more harm as well, but the path is not actively
being tested: cplane no longer puts locationconfigs with a higher
generation number during normal operation, which is what prompted the
last wave of fixes.

Cc: #7341.
koivunej authored May 22, 2024
1 parent e015b2b commit 62aac6c
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions pageserver/src/tenant/storage_layer/layer.rs
@@ -12,7 +12,7 @@ use std::time::{Duration, SystemTime};
 use tracing::Instrument;
 use utils::id::TimelineId;
 use utils::lsn::Lsn;
-use utils::sync::heavier_once_cell;
+use utils::sync::{gate, heavier_once_cell};

 use crate::config::PageServerConf;
 use crate::context::{DownloadBehavior, RequestContext};
@@ -1333,7 +1333,7 @@ impl LayerInner {

         is_good_to_continue(&rx.borrow_and_update())?;

-        let Ok(_gate) = timeline.gate.enter() else {
+        let Ok(gate) = timeline.gate.enter() else {
             return Err(EvictionCancelled::TimelineGone);
         };

@@ -1421,7 +1421,7 @@ impl LayerInner {
         Self::spawn_blocking(move || {
             let _span = span.entered();

-            let res = self.evict_blocking(&timeline, &permit);
+            let res = self.evict_blocking(&timeline, &gate, &permit);

             let waiters = self.inner.initializer_count();

@@ -1447,6 +1447,7 @@ impl LayerInner {
     fn evict_blocking(
         &self,
         timeline: &Timeline,
+        _gate: &gate::GateGuard,
         _permit: &heavier_once_cell::InitPermit,
     ) -> Result<(), EvictionCancelled> {
         // now accesses to `self.inner.get_or_init*` wait on the semaphore or the `_permit`
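
The drop-timing problem this commit fixes can be sketched in isolation. The example below is a minimal illustration, not the pageserver code: `Gate` and `GateGuard` here are hypothetical stand-ins that only count live guards (they are not `utils::sync::gate`), and a plain `std::thread` stands in for the blocking task. It shows that a guard left in the spawning function's scope is already gone when the blocking work runs, while a guard moved into the closure stays alive until that work finishes.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

/// Hypothetical stand-in for a gate: it only counts live guards.
#[derive(Clone, Default)]
struct Gate {
    entered: Arc<AtomicUsize>,
}

/// Hypothetical guard: decrements the counter when dropped.
struct GateGuard {
    entered: Arc<AtomicUsize>,
}

impl Gate {
    fn enter(&self) -> GateGuard {
        self.entered.fetch_add(1, Ordering::SeqCst);
        GateGuard { entered: Arc::clone(&self.entered) }
    }

    fn live_guards(&self) -> usize {
        self.entered.load(Ordering::SeqCst)
    }
}

impl Drop for GateGuard {
    fn drop(&mut self) {
        self.entered.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Buggy shape: the guard is never moved into the closure, so it is
/// dropped when this function returns, before the blocking work runs.
fn start_without_guard(gate: &Gate, go: mpsc::Receiver<()>) -> thread::JoinHandle<usize> {
    let _guard = gate.enter();
    let gate = gate.clone();
    thread::spawn(move || {
        go.recv().unwrap(); // wait until the spawning function has returned
        gate.live_guards()
    })
}

/// Fixed shape: the guard is moved into the closure and lives until
/// the blocking work is done.
fn start_with_guard(gate: &Gate, go: mpsc::Receiver<()>) -> thread::JoinHandle<usize> {
    let guard = gate.enter();
    let gate = gate.clone();
    thread::spawn(move || {
        let _guard = guard; // carried into the blocking phase
        go.recv().unwrap();
        gate.live_guards()
    })
}

/// Returns how many guards were alive while the blocking work ran.
fn observed_guards(carry: bool) -> usize {
    let gate = Gate::default();
    let (tx, rx) = mpsc::channel();
    let handle = if carry {
        start_with_guard(&gate, rx)
    } else {
        start_without_guard(&gate, rx)
    };
    // Release the worker only after the spawning function has returned.
    tx.send(()).unwrap();
    handle.join().unwrap()
}

fn main() {
    println!("guard left behind: {}", observed_guards(false)); // prints 0
    println!("guard carried:     {}", observed_guards(true)); // prints 1
}
```

In the diff above, passing `&gate` to `evict_blocking` plays the role of the `let _guard = guard;` line: referencing the guard inside the `move` closure forces it to be captured, so it is held until eviction completes.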

1 comment on commit 62aac6c

@github-actions

3190 tests run: 3050 passed, 0 failed, 140 skipped (full report)


Flaky tests (1)

Postgres 14

  • test_vm_bit_clear_on_heap_lock: debug

Code coverage* (full report)

  • functions: 31.3% (6415 of 20477 functions)
  • lines: 48.1% (49325 of 102645 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
62aac6c at 2024-05-22T16:39:33.769Z :recycle:
