Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: include generation number in local layer paths #7609

Merged
merged 20 commits into from
May 7, 2024

Conversation

jcsp
Copy link
Contributor

@jcsp jcsp commented May 3, 2024

Problem

In #7531, we would like to be able to rewrite layers safely. One option is to make Layer able to rewrite files in place safely (e.g. by blocking evictions/deletions for an old Layer while a new one is created), but that's relatively fragile. It's more robust in general if we simply never overwrite the same local file: we can do that by putting the generation number in the filename.

Summary of changes

  • Add local_layer_path (counterpart to remote_layer_path) and convert all locations that manually constructed a local layer path by joining LayerFileName to timeline path
  • In the layer upload path, construct remote paths with remote_layer_path rather than trying to build them out of a local path.
  • During startup, carry the full path to layer files through init::reconcile, and pass it into Layer::for_resident
  • Add a test to make sure we handle upgrades properly.
  • Comment out the generation part of local_layer_path, since we need to maintain forward compatibility for one release. A tiny followup PR will enable it afterwards.

We could make this a bit simpler if we bulk renamed existing layers on startup instead of carrying literal paths through init, but that is operationally risky on existing servers with millions of layer files. We can always do a renaming change in future if it becomes annoying, but for the moment it's kind of nice to have a structure that enables us to change local path names again in future quite easily.

We should rename LayerFileName to LayerName or somesuch, to make it more obvious that it's not a literal filename: this was already a bit confusing where that type is used in remote paths. That will be a followup, to avoid polluting this PR's diff.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added t/feature Issue type: feature, for new features or requests c/storage/pageserver Component: storage: pageserver labels May 3, 2024
Copy link

github-actions bot commented May 3, 2024

2976 tests run: 2843 passed, 0 failed, 133 skipped (full report)


Flaky tests (3)

Postgres 16

  • test_compute_pageserver_connection_stress: debug
  • test_vm_bit_clear_on_heap_lock: debug

Postgres 15

  • test_sharding_split_failures[failure1]: debug

Code coverage* (full report)

  • functions: 31.3% (6248 of 19986 functions)
  • lines: 46.7% (46863 of 100252 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
77441a2 at 2024-05-07T17:00:20.584Z :recycle:

@jcsp jcsp force-pushed the jcsp/issue-7504-local-paths branch from 4377c71 to 9bb1baf Compare May 6, 2024 13:04
@jcsp jcsp force-pushed the jcsp/issue-7504-local-paths branch from 9bb1baf to d9b0b13 Compare May 6, 2024 14:50
@jcsp jcsp marked this pull request as ready for review May 6, 2024 15:34
@jcsp jcsp requested a review from a team as a code owner May 6, 2024 15:34
@jcsp jcsp requested review from petuhovskiy and problame May 6, 2024 15:34
Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if FromStr discarding the generation number suffix is particularly idiomatic, but, seems like the path of least resistance.


How did you discover all the places that needed to switch over to local_layer_path? (Asking to judge how diligently I need to re-scan the parts of the code base that have not been changed by this PR)


I think this change is sufficient for #7531, but I think we should take a step back and see whether a more general approach will serve us better.

I'm thinking of image layer compression, and also potential future changes like adding bloom filters to layer files.

For these use cases, we'll have to rewrite a layer file within the same generation. Your #7531 can get away with relying on the fact that post-split, it's always a different generation than pre-split. But, that's a special case.

I feel like the code base is missing the distinction between physical and logical identity of a layer.

  • Logical identity is "what data is in there", and it's determined completely by (tenant_id,timeline_id,key range, lsn range).
  • Physical identity is "logical identity + an exact bit pattern".

For compression & bloom filters, we'd be creating a new physical identity.

The post-split rewrites in #7531, it's a weird mix of logical & physical identity change. But, if you tweak the definitions a little, it's also just a physical change.

  • Logical identity is "what data is AT LEAST in there", and it's determined completely by (tenant_id,timeline_id,key range, lsn range).
  • Physical identity is "logical identity + an exact bit pattern".

The point I'm trying to make is that #7531 is effectively just a very fancy compression mechanism that knows it can throw away data because it's never going to be read in the shard split child.

I think to solve this cleanly, we need another disambiguator in layer file local name & remote storage key. That disambiguator would identify the physical bit pattern. A cryptographically strong checksum would be suitable for this.

pageserver/src/tenant/remote_timeline_client/download.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/timeline/init.rs Outdated Show resolved Hide resolved
pageserver/src/tenant/timeline/init.rs Show resolved Hide resolved
pageserver/src/tenant/remote_timeline_client.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a call with John.

Notes:

  • General agreement about physical vs logical identity

  • On a clean slate, we'd likely introduce hashes as a physical disambiguator before implementing any rewriting.

  • But, we're under some time pressure.

  • Generation numbers are actually not bumped during split => would have to include the shard identity in the local file name to physically disambiguate

  • NB: in remote storage, the shard identity is included in the key => in remote storage, we have the physical disambiguation

  • Introducing writing (not checking) of the hashing would still mean we'd have to pick a hashing function and perf-test it.

Decision:

@problame
Copy link
Contributor

problame commented May 7, 2024

I gave approval but at least #7609 (comment) needs changes, I think.

Copy link
Member

@petuhovskiy petuhovskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think I know enough to comment on the code, will just ask general questions instead:

  1. This PR adds support for optional -<gen> suffix for layer files, but we don’t want to actually create files with this suffix until next PR, right?
  2. -<gen> suffix is different from LSN by length (8 vs 16) and using lowercase hex (instead of uppercase in LSN)? Need to be extra careful with $ in the end of regex?
  3. Are we going to add this suffix in remote storage as well?
  4. What happens in case of unusual setups, e.g. one shard on ps1 running newer version and writes -<gen> suffix and another shard on ps2 doesn’t? Or what if we rollback pageserver after it started writing new suffix, and then roll forward? Everything should work, right?
  5. Do we have any other tools assuming name of the layers with no suffix? s3_scrubber?

@jcsp
Copy link
Contributor Author

jcsp commented May 7, 2024

This PR adds support for optional - suffix for layer files, but we don’t want to actually create files with this suffix until next PR, right?

Right.

- suffix is different from LSN by length (8 vs 16) and using lowercase hex (instead of uppercase in LSN)? Need to be extra careful with $ in the end of regex?

Right. Checking the length/case of the hex regions is important, otherwise an image layer with a generation suffix could parse as an image layer (although this problem goes away if we add the -v1- part that christian suggestions, which I intend to do)

Are we going to add this suffix in remote storage as well?

Remote storage paths already include the ShardIndex (since they are prefaced with a TenantShardId) and Generation suffix.

What happens in case of unusual setups, e.g. one shard on ps1 running newer version and writes - suffix and another shard on ps2 doesn’t?

Since this is only changing local paths to layers, that would have no impact: each pageserver only sees its own local paths, and the remote S3 path is unchanged.

Or what if we rollback pageserver after it started writing new suffix, and then roll forward? Everything should work, right?

Right. Since we will delay merging the change to write the new suffix, we would be rolling back to code (this PR) which already knows how to read it.

Do we have any other tools assuming name of the layers with no suffix? s3_scrubber?

Scrubber only cares about remote paths. Not sure if pagectl cares about layer file names -- it's in a weird unmaintained state currently, no tests exist & I don't think we've used it for quite a while.

@jcsp
Copy link
Contributor Author

jcsp commented May 7, 2024

Picking up followups in #7640 (will merge that before we do next release)

@jcsp jcsp enabled auto-merge (squash) May 7, 2024 17:02
@jcsp jcsp merged commit 0af66a6 into main May 7, 2024
52 of 53 checks passed
@jcsp jcsp deleted the jcsp/issue-7504-local-paths branch May 7, 2024 17:03
conradludgate pushed a commit that referenced this pull request May 8, 2024
## Problem

In #7531, we would like to be
able to rewrite layers safely. One option is to make `Layer` able to
rewrite files in place safely (e.g. by blocking evictions/deletions for
an old Layer while a new one is created), but that's relatively fragile.
It's more robust in general if we simply never overwrite the same local
file: we can do that by putting the generation number in the filename.

## Summary of changes

- Add `local_layer_path` (counterpart to `remote_layer_path`) and
convert all locations that manually constructed a local layer path by
joining LayerFileName to timeline path
- In the layer upload path, construct remote paths with
`remote_layer_path` rather than trying to build them out of a local
path.
- During startup, carry the full path to layer files through
`init::reconcile`, and pass it into `Layer::for_resident`
- Add a test to make sure we handle upgrades properly.
- Comment out the generation part of `local_layer_path`, since we need
to maintain forward compatibility for one release. A tiny followup PR
will enable it afterwards.

We could make this a bit simpler if we bulk renamed existing layers on
startup instead of carrying literal paths through init, but that is
operationally risky on existing servers with millions of layer files. We
can always do a renaming change in future if it becomes annoying, but
for the moment it's kind of nice to have a structure that enables us to
change local path names again in future quite easily.

We should rename `LayerFileName` to `LayerName` or somesuch, to make it
more obvious that it's not a literal filename: this was already a bit
confusing where that type is used in remote paths. That will be a
followup, to avoid polluting this PR's diff.
jcsp added a commit that referenced this pull request May 14, 2024
We recently added support for local layer paths that contain a
generation number:
- #7609
- #7640

Now that we've cut a
[release](#7735) that includes
those changes, we can proceed to enable writing the new format without
breaking forward compatibility.
a-masterov pushed a commit that referenced this pull request May 20, 2024
We recently added support for local layer paths that contain a
generation number:
- #7609
- #7640

Now that we've cut a
[release](#7735) that includes
those changes, we can proceed to enable writing the new format without
breaking forward compatibility.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/pageserver Component: storage: pageserver t/feature Issue type: feature, for new features or requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants