pageserver: handle WAL gaps on sharded tenants #6788
Conversation
2754 tests run: 2630 passed, 0 failed, 124 skipped (full report)
Flaky tests (4): Postgres 16
Code coverage* (full report)
* collected from Rust tests only
This comment is automatically updated with the latest test results.
Last updated: 4e3ea61 at 2024-04-04T17:07:46.062Z
Force-pushed from 65d6a7e to fe3d689
Force-pushed from fe3d689 to 3a18b9d
One of the reasons we didn't see this more widely is that nonzero shards still often do SLRU writes during ingest, even if they're not ingesting any of the data in a WAL record. This will probably be easier to test if I also make the change to only ingest SLRU writes on shard zero.
To be honest, I don't fully trust myself around the layer freezing code; it's incredibly brittle.
In addition to my comments and questions, I would suggest getting another review from Vlad; he's been looking into this code path anyway.
Changes in addition to review comments:
Force-pushed from 606df7f to 9f02912
Force-pushed from 9f02912 to 6f8f021
I backed out the changes to ingest logic and loosened the test slightly to tolerate that. The ingest changes were rather ugly and fragile, and in any case it was incorrect to drop all SLRU content on shards >0, because those shards rely on it for GC bound calculation.
Problem
In the test for #6776, one of the test cases uses tiny layer sizes and tiny stripe sizes. This hits a scenario where a shard's checkpoint interval spans a region of the WAL in which none of the content is ingested by that shard. Since there is no layer to flush, we do not advance disk_consistent_lsn, and this causes the test to fail while waiting for the LSN to advance.
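To illustrate how such a gap arises, here is a minimal sketch of striped shard assignment. The StripeConfig struct and shard_of function are hypothetical stand-ins, not the pageserver's actual sharding API: with a tiny stripe size, each shard sees long WAL spans in which every record belongs to some other shard.

```rust
// Hypothetical sketch of striped page-to-shard mapping (illustrative names,
// not the real pageserver types).

struct StripeConfig {
    shard_count: u32,
    stripe_size_pages: u32, // tiny in the failing test case
}

/// Map an absolute page number to the shard that owns it.
fn shard_of(cfg: &StripeConfig, page_no: u32) -> u32 {
    (page_no / cfg.stripe_size_pages) % cfg.shard_count
}

fn main() {
    let cfg = StripeConfig { shard_count: 4, stripe_size_pages: 2 };
    // A run of WAL records touching pages 0..8: every shard other than the
    // owner of a given stripe sees that span of WAL without ingesting any
    // of its content, so it freezes and flushes no layers there.
    for page_no in 0..8 {
        println!("page {page_no} -> shard {}", shard_of(&cfg, page_no));
    }
}
```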
Summary of changes
- Pass the frozen-to LSN along with the flush request on layer_flush_start_tx. This is the LSN to which we have frozen at the time we ask the flush loop to flush layers frozen up to this point.
- In the flush loop, if there are no layers to flush but the request carried a frozen_to_lsn, then advance disk_consistent_lsn up to this point.
- In maybe_freeze_ephemeral_layer, handle the case where last_record_lsn has advanced without writing a layer file: this ensures that disk_consistent_lsn and remote_consistent_lsn advance anyway.

The net effect is that disk_consistent_lsn is allowed to advance past regions in the WAL where a shard ingests no data, and that we uphold our guarantee that remote_consistent_lsn always eventually reaches the tip of the WAL.
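For concreteness, here is a hedged sketch of the mechanism the list above describes, under heavily simplified assumptions: Lsn, TimelineState, handle_flush_request, and maybe_freeze are illustrative stand-ins, not the actual pageserver code.

```rust
// Illustrative sketch only: simplified stand-ins for the pageserver's
// flush path, showing how a frozen-to LSN lets disk_consistent_lsn advance
// even when no layers were produced for a span of WAL.

#[derive(Clone, Copy, PartialEq, PartialOrd, Debug)]
struct Lsn(u64);

struct FrozenLayer {
    end_lsn: Lsn,
}

struct TimelineState {
    disk_consistent_lsn: Lsn,
    last_record_lsn: Lsn,
    frozen_layers: Vec<FrozenLayer>,
}

impl TimelineState {
    /// The flush request now carries the LSN we had frozen up to when the
    /// request was sent (the frozen_to_lsn from the PR description).
    fn handle_flush_request(&mut self, frozen_to_lsn: Lsn) {
        while let Some(layer) = self.frozen_layers.pop() {
            // ... write the layer file, schedule its upload, etc. ...
            self.advance_disk_consistent_lsn(layer.end_lsn);
        }
        // Even if there was nothing to flush, this span of WAL has been
        // fully processed: advance disk_consistent_lsn to the frozen point.
        self.advance_disk_consistent_lsn(frozen_to_lsn);
    }

    /// Analogue of the maybe_freeze_ephemeral_layer case: last_record_lsn
    /// advanced but no layer data was written, so there is nothing to
    /// freeze, yet the consistency LSNs must still move forward.
    fn maybe_freeze(&mut self, wrote_layer_data: bool) {
        if !wrote_layer_data && self.last_record_lsn > self.disk_consistent_lsn {
            self.handle_flush_request(self.last_record_lsn);
        }
    }

    fn advance_disk_consistent_lsn(&mut self, to: Lsn) {
        if to > self.disk_consistent_lsn {
            self.disk_consistent_lsn = to;
            // remote_consistent_lsn follows once the index metadata
            // upload (possibly containing no new layers) completes.
        }
    }
}

fn main() {
    let mut tl = TimelineState {
        disk_consistent_lsn: Lsn(100),
        last_record_lsn: Lsn(200),
        frozen_layers: Vec::new(),
    };
    // No layers were frozen for this WAL span, but the LSN still advances.
    tl.maybe_freeze(false);
    assert_eq!(tl.disk_consistent_lsn, Lsn(200));
    println!("disk_consistent_lsn = {:?}", tl.disk_consistent_lsn);
}
```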
The case of no layer at all is hard to test at present, because shards >0 are still polluted with SLRU writes, but I have tested it locally with a branch that disables SLRU writes on shards >0. We can tighten up the testing on this in future as/when we refine shard filtering (currently shards >0 need the SLRU content because they use it to compute the GC cutoff via timestamp-to-LSN lookup).