-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pageserver: location configuration API, attachment modes, secondary locations #5299
Conversation
ca2270d
to
9b6bda9
Compare
2238 tests run: 2122 passed, 0 failed, 116 skipped (full report)Code coverage (full report)
The comment gets automatically updated with the latest test results
3b1b1d8 at 2023-10-04T17:11:12.357Z :recycle: |
a435997
to
665647c
Compare
665647c
to
50bdc2e
Compare
50bdc2e
to
9239e92
Compare
9239e92
to
1c5e6cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't review mgr.rs nor config.rs yet. Will do that tomorrow. In the meantime, left you comment #5299 (comment) to worry about / figure out what to do.
Co-authored-by: Christian Schwarz <christian@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the two comments I added, I approve these changes here.
Not sure if we should merge them before the existing generation numbers support has gotten exposure to prod.
…e map (#5367) ## Problem Follows on from #5299 - We didn't have a generic way to protect a tenant undergoing changes: `Tenant` had states, but for our arbitrary transitions between secondary/attached, we need a general way to say "reserve this tenant ID, and don't allow any other ops on it, but don't try and report it as being in any particular state". - The TenantsMap structure was behind an async RwLock, but it was never correct to hold it across await points: that would block any other changes for all tenants. ## Summary of changes - Add the `TenantSlot::InProgress` value. This means: - Incoming administrative operations on the tenant should retry later - Anything trying to read the live state of the tenant (e.g. a page service reader) should retry later or block. - Store TenantsMap in `std::sync::RwLock` - Provide an extended `get_active_tenant_with_timeout` for page_service to use, which will wait on InProgress slots as well as non-active tenants. Closes: #5378 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>
…ration none/broken usages (#7947) ## Problem In #5299, the new config-v1 tenant config file was added to hold the LocationConf type. We left the old config file in place for forward compat, and because running without generations (therefore without LocationConf) as still useful before the storage controller was ready for prime-time. Closes: #5388 ## Summary of changes - Remove code for reading and writing the legacy config file - Remove Generation::Broken: it was unused. - Treat missing config file on disk as an error loading a tenant, rather than defaulting it. We can now remove LocationConf::default, and thereby guarantee that we never construct a tenant with a None generation. - Update some comments + add some assertions to clarify that Generation::None is only used in layer metadata, not in the state of a running tenant. - Update docker compose test to create tenants with a generation
…ration none/broken usages (#7947) ## Problem In #5299, the new config-v1 tenant config file was added to hold the LocationConf type. We left the old config file in place for forward compat, and because running without generations (therefore without LocationConf) as still useful before the storage controller was ready for prime-time. Closes: #5388 ## Summary of changes - Remove code for reading and writing the legacy config file - Remove Generation::Broken: it was unused. - Treat missing config file on disk as an error loading a tenant, rather than defaulting it. We can now remove LocationConf::default, and thereby guarantee that we never construct a tenant with a None generation. - Update some comments + add some assertions to clarify that Generation::None is only used in layer metadata, not in the state of a running tenant. - Update docker compose test to create tenants with a generation
Problem
These changes are part of building seamless tenant migration, as described in the RFC:
Summary of changes
LocationConf
supersedesTenantConfOpt
for storing a tenant's configuration in the pageserver repo dir. It containsTenantConfOpt
, as well as a newmode
attribute that describes what kind of location this is (secondary, attached, attachment mode etc). It is written to a file calledconfig-v1
instead ofconfig
-- this prepares us for neatly making any other profound changes to the format of the file in future. Forward compat for existing pageserver code is achieved by writing out both old and new style files. Backward compat is achieved by checking for the old-style file if the new one isn't found.TenantMap
type changes, to holdTenantSlot
instead of justTenant
. TheTenant
type continues to be used for attached tenants only. Tenants in other states (such as secondaries) are represented by a different variant ofTenantSlot
.Tenant
&Timeline
used to hold an Arc<Mutex>, they now hold a reference to a AttachedTenantConf, which includes the extra information from LocationConf. This enables them to know the current attachment mode.PUT /tenants/<tenant_id>/location_config
to drive new location configuration. This provides a superset of the functionality of attach/detach/load/ignore:Caveats:
location_config
API are not handled elegantly in this PR, a better mechanism is added in the follow on pageserver: addInProgress
tenant map state, use a sync lock for the map #5367Closes: #5379
Checklist before requesting a review
Checklist before merging