
pageserver: handle WAL gaps on sharded tenants #6788

Merged · 4 commits into main · Apr 4, 2024

Conversation

@jcsp (Collaborator) commented Feb 16, 2024

Problem

In the test for #6776, a test case uses tiny layer sizes and tiny stripe sizes. This hits a scenario where a shard's checkpoint interval spans a region of the WAL in which none of the content is ingested by that shard. Since there is no layer to flush, we do not advance disk_consistent_lsn, and the test fails while waiting for the LSN to advance.

Summary of changes

  • Pass an LSN through layer_flush_start_tx: the LSN up to which layers had been frozen at the time the flush was requested.
  • In the layer flush task, if the layers we flush do not reach that frozen_to_lsn, advance disk_consistent_lsn to it anyway.
  • In maybe_freeze_ephemeral_layer, handle the case where last_record_lsn has advanced without writing a layer file: this ensures that disk_consistent_lsn and remote_consistent_lsn still advance.

The net effect is that disk_consistent_lsn is allowed to advance past regions of the WAL in which a shard ingests no data, upholding the guarantee that remote_consistent_lsn always eventually reaches the tip of the WAL. A rough sketch of this flow is below.
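
Loosely, the first two bullets work as in the sketch below. This is not the actual pageserver code: `Lsn`, `FlushRequest`, and the plain channel are simplified stand-ins for the real types, and the real layer_flush_start_tx carries more than just an LSN.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc;

/// Simplified stand-in for the pageserver's Lsn type.
type Lsn = u64;

/// Hypothetical flush request: "flush everything frozen up to this LSN".
struct FlushRequest {
    frozen_to_lsn: Lsn,
}

fn main() {
    let disk_consistent_lsn = AtomicU64::new(0);
    let (layer_flush_start_tx, layer_flush_start_rx) = mpsc::channel::<FlushRequest>();

    // Freezing side: layers were frozen up to LSN 0x60, but this shard ingested
    // nothing in 0x40..0x60, so no layer file will cover that range.
    layer_flush_start_tx
        .send(FlushRequest { frozen_to_lsn: 0x60 })
        .unwrap();
    drop(layer_flush_start_tx);

    // Flush task: write out whatever frozen layers exist, then advance
    // disk_consistent_lsn to frozen_to_lsn even if the layers stop short of it.
    for req in layer_flush_start_rx {
        let flushed_up_to: Lsn = 0x40; // highest LSN covered by the layers we wrote
        let new_value = flushed_up_to.max(req.frozen_to_lsn);
        disk_consistent_lsn.fetch_max(new_value, Ordering::SeqCst);
    }

    // The un-ingested region 0x40..0x60 no longer holds back disk_consistent_lsn.
    assert_eq!(disk_consistent_lsn.load(Ordering::SeqCst), 0x60);
}
```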

The case of no layer at all is hard to test at present because shards >0 are polluted with SLRU writes, but I have tested it locally with a branch that disables SLRU writes on shards >0. We can tighten up the testing in future as/when we refine shard filtering (currently shards >0 need the SLRU because they use it to compute the GC cutoff via timestamp-to-LSN lookups).

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat the commit message so it does not include the above checklist

@jcsp jcsp added t/bug Issue Type: Bug c/storage/pageserver Component: storage: pageserver labels Feb 16, 2024
@jcsp jcsp changed the title from "Jcsp/sharding disk consistent lsn" to "pageserver: handle WAL gaps on sharded tenants" Feb 16, 2024

github-actions bot commented Feb 16, 2024

2754 tests run: 2630 passed, 0 failed, 124 skipped (full report)


Flaky tests (4)

Postgres 16

  • test_compute_pageserver_connection_stress: release
  • test_null_config: release
  • test_deletion_queue_recovery[no-validate-lose]: debug
  • test_vm_bit_clear_on_heap_lock: debug

Code coverage* (full report)

  • functions: 28.0% (6396 of 22867 functions)
  • lines: 46.8% (45023 of 96104 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
4e3ea61 at 2024-04-04T17:07:46.062Z

@jcsp jcsp self-assigned this Feb 26, 2024
@jcsp jcsp force-pushed the jcsp/sharding-disk-consistent-lsn branch from 65d6a7e to fe3d689 on March 1, 2024 16:45
@jcsp jcsp marked this pull request as ready for review March 1, 2024 16:50
@jcsp jcsp requested a review from a team as a code owner March 1, 2024 16:50
@jcsp jcsp requested a review from arpad-m March 1, 2024 16:50
@jcsp jcsp force-pushed the jcsp/sharding-disk-consistent-lsn branch from fe3d689 to 3a18b9d on March 1, 2024 16:53
@jcsp jcsp requested a review from problame March 11, 2024 16:12
@jcsp (Collaborator, Author) commented Mar 12, 2024

One of the reasons we didn't see this more widely is that nonzero shards still often do SLRU writes during ingest, even if they're not ingesting any of the data in a WAL record. This will probably be easier to test if I also make the change to only ingest SLRU writes on shard zero.

Review threads on pageserver/src/tenant/timeline.rs (resolved)
@problame (Contributor) left a comment


To be honest, I don't fully trust myself around the layer freezing code; it's incredibly brittle.

In addition to my comments / questions, I would suggest getting another review from Vlad; he's been looking into this code path anyway.

Review threads on pageserver/src/tenant/timeline.rs (outdated, resolved)
@jcsp (Collaborator, Author) commented Apr 3, 2024

Changes in addition to review comments:

  • Making a stable test for this highlighted the need to skip SLRU and checkpoint content on shards >0; otherwise those shards too often receive extra writes that prevent us from exploring the scenario where a shard ingests nothing within a region of the WAL.
  • Just advancing disk_consistent_lsn on layer flush was not sufficient: we also need to advance it when there is no layer to flush at all, and to advance remote_consistent_lsn as well (sketched below). That is added in a second commit.
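
A rough illustration of that second point. The names here (TimelineSketch, maybe_advance_without_flush, has_open_layer) are hypothetical stand-ins for the real Timeline state; the actual logic lives in maybe_freeze_ephemeral_layer and the remote upload path.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

type Lsn = u64;

/// Hypothetical, heavily simplified stand-in for the timeline state involved.
struct TimelineSketch {
    last_record_lsn: AtomicU64,
    disk_consistent_lsn: AtomicU64,
    remote_consistent_lsn: AtomicU64,
    /// Whether an open (ephemeral) layer currently holds un-flushed writes.
    has_open_layer: bool,
}

impl TimelineSketch {
    /// Sketch of the "nothing to freeze" path: ingest advanced last_record_lsn
    /// but wrote nothing on this shard, so there is no layer to flush. We still
    /// move the consistency watermarks forward so they can reach the WAL tip.
    fn maybe_advance_without_flush(&self) {
        if self.has_open_layer {
            return; // Real freezing/flushing handles this case instead.
        }
        let tip = self.last_record_lsn.load(Ordering::SeqCst);
        self.disk_consistent_lsn.fetch_max(tip, Ordering::SeqCst);
        // In the real code this means scheduling an index upload so the remote
        // value eventually catches up; here we simply bump it directly.
        self.remote_consistent_lsn.fetch_max(tip, Ordering::SeqCst);
    }
}

fn main() {
    let tl = TimelineSketch {
        last_record_lsn: AtomicU64::new(0x80),
        disk_consistent_lsn: AtomicU64::new(0x40),
        remote_consistent_lsn: AtomicU64::new(0x40),
        has_open_layer: false,
    };
    tl.maybe_advance_without_flush();
    assert_eq!(tl.disk_consistent_lsn.load(Ordering::SeqCst), 0x80);
    assert_eq!(tl.remote_consistent_lsn.load(Ordering::SeqCst), 0x80);
}
```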

@jcsp jcsp force-pushed the jcsp/sharding-disk-consistent-lsn branch 2 times, most recently from 606df7f to 9f02912 on April 4, 2024 09:31
@jcsp jcsp changed the title from "pageserver: handle WAL gaps on sharded tenants" to "pageserver: more selective sharded ingest, handle WAL gaps on sharded tenants" Apr 4, 2024
@jcsp jcsp changed the title from "pageserver: more selective sharded ingest, handle WAL gaps on sharded tenants" to "pageserver: handle WAL gaps on sharded tenants" Apr 4, 2024
@jcsp jcsp force-pushed the jcsp/sharding-disk-consistent-lsn branch from 9f02912 to 6f8f021 on April 4, 2024 13:43
@jcsp (Collaborator, Author) commented Apr 4, 2024

I backed out the changes to ingest logic, and loosened the test slightly to tolerate that. The ingest changes were kind of ugly/fragile, and in any case it was incorrect to drop all SLRU content on shards >0, because they relied on it for GC bound calculation.

@jcsp jcsp enabled auto-merge (squash) April 4, 2024 16:24
@jcsp jcsp merged commit ac7fc61 into main Apr 4, 2024
46 of 47 checks passed
@jcsp jcsp deleted the jcsp/sharding-disk-consistent-lsn branch April 4, 2024 16:54