Release 2024-03-27

vipvap · 2024-03-27T18:16:37Z

Please merge this Pull Request using 'Create a merge commit' button

## Problem

We currently hold the layer map read lock while doing IO on the read path. This is not required for correctness.

## Summary of changes

Drop the layer map lock after figuring out which layer we wish to read from.

Why is this correct:

* `Layer` models the lifecycle of an on disk layer. In the event the layer is removed from local disk, it will be on demand downloaded
* `InMemoryLayer` holds the `EphemeralFile` which wraps the on disk file. As long as the `InMemoryLayer` is in scope, it's safe to read from it.

Related #6833
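The locking change can be illustrated with a minimal, std-only sketch. The `Layer` and `LayerMap` types below are hypothetical stand-ins, not the pageserver's actual types: the read lock is held only long enough to pick a layer and clone its `Arc`, and the (simulated) IO happens after the guard is dropped.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a layer. It owns its data, so it stays
/// readable for as long as some `Arc` pointing at it is alive.
struct Layer {
    data: Vec<u8>,
}

impl Layer {
    /// Stand-in for slow IO against the layer.
    fn read_at(&self, offset: usize) -> Option<u8> {
        self.data.get(offset).copied()
    }
}

/// Hypothetical layer map keyed by the start LSN of each layer.
struct LayerMap {
    layers: RwLock<BTreeMap<u64, Arc<Layer>>>,
}

impl LayerMap {
    fn read(&self, lsn: u64, offset: usize) -> Option<u8> {
        // Hold the read lock only while choosing the layer and cloning its Arc.
        let layer = {
            let guard = self.layers.read().unwrap();
            let (_, layer) = guard.range(..=lsn).next_back()?;
            Arc::clone(layer)
        }; // The guard is dropped here, before any IO starts.

        // The IO runs without the lock; the Arc keeps the layer alive.
        layer.read_at(offset)
    }
}
```

Because the reader owns an `Arc` to the chosen layer, a writer can update the map while the read is still in progress.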

## Problem

`test_bulk_insert` has become too slow, and it fails constantly: #7124

## Summary of changes

- Skip `test_bulk_insert` until it's fixed

## Problem

Currently, we return 409 (Conflict) in two cases:

- Temporary: Timeline creation cannot proceed because another timeline with the same ID is being created
- Permanent: Timeline creation cannot proceed because another timeline exists with different parameters but the same ID.

Callers which time out a request and retry should be able to distinguish these cases.

Closes: #7208

## Summary of changes

- Expose `AlreadyCreating` errors as 429 instead of 409
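A minimal sketch of how the two cases separate once the status codes differ. The enum, its `Conflict` variant, and both functions are hypothetical illustrations; only the `AlreadyCreating` name and the 409/429 codes come from the description above.

```rust
/// Hypothetical error type for timeline creation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CreateTimelineError {
    /// Temporary: a timeline with the same ID is still being created.
    AlreadyCreating,
    /// Permanent: a timeline with the same ID but different parameters exists.
    Conflict,
}

/// Map each error to the HTTP status code described in the summary.
fn status_code(err: CreateTimelineError) -> u16 {
    match err {
        CreateTimelineError::AlreadyCreating => 429,
        CreateTimelineError::Conflict => 409,
    }
}

/// Assumed caller-side policy: only the temporary case is worth retrying.
fn should_retry(status: u16) -> bool {
    status == 429
}
```

A caller that timed out and retried can now treat 429 as "try again later" and 409 as a definitive failure.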

## Problem

Follows: #7182

- Sufficient concurrent writes could OOM a pageserver from the size of indices on all the InMemoryLayer instances.
- Enforcement of checkpoint_period only happened if there were some writes.

Closes: #6916

## Summary of changes

- Add `ephemeral_bytes_per_memory_kb` config property. This controls the ratio of ephemeral layer capacity to memory capacity. The weird unit is to enable making the ratio less than 1:1 (set this property to 1024 to use 1MB of ephemeral layers for every 1MB of RAM, set it smaller to get a fraction).
- Implement background layer rolling checks in Timeline::compaction_iteration -- this ensures we apply layer rolling policy in the absence of writes.
- During background checks, if the total ephemeral layer size has exceeded the limit, then roll layers whose size is greater than the mean size of all ephemeral layers.
- Remove the tick() path from walreceiver: it isn't needed any more now that we do equivalent checks from compaction_iteration.
- Add tests for the above.

---------

Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
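The ratio arithmetic and the "roll layers larger than the mean" rule can be sketched as follows. Both function names are hypothetical, the mean uses integer division, and the real implementation may differ in detail; this only mirrors the description above.

```rust
/// Ephemeral layer budget in bytes, given total RAM in bytes and the
/// `ephemeral_bytes_per_memory_kb` setting (bytes of ephemeral layer
/// capacity allowed per KiB of RAM).
fn ephemeral_limit_bytes(total_memory_bytes: u64, ephemeral_bytes_per_memory_kb: u64) -> u64 {
    (total_memory_bytes / 1024) * ephemeral_bytes_per_memory_kb
}

/// Indices of the layers to roll: nothing while the total is within the
/// limit, otherwise every layer strictly larger than the mean size.
fn layers_to_roll(layer_sizes: &[u64], limit_bytes: u64) -> Vec<usize> {
    let total: u64 = layer_sizes.iter().sum();
    if layer_sizes.is_empty() || total <= limit_bytes {
        return Vec::new();
    }
    let mean = total / layer_sizes.len() as u64;
    layer_sizes
        .iter()
        .enumerate()
        .filter(|&(_, &size)| size > mean)
        .map(|(index, _)| index)
        .collect()
}
```

With a setting of 1024 the budget equals the RAM size (1:1); with 512 it is half of it. For layer sizes of 10, 20 and 90 bytes and a limit of 100 bytes, the total (120) exceeds the limit, the mean is 40, and only the 90-byte layer is rolled.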

## Problem

- Creations were not idempotent (unique key violation)
- Creations waited for reconciliation, which control plane blocks while an operation is in flight

## Summary of changes

- Handle unique key constraint violation as an OK situation: if we're creating the same tenant ID and shard count, it's reasonable to assume this is a duplicate creation.
- Make the wait for reconcile during creation tolerate failures: this is similar to location_conf, where the cloud control plane blocks our notification calls until it is done with calling into our API (in future this constraint is expected to relax as the cloud control plane learns to run multiple operations concurrently for a tenant)
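The idempotency rule can be sketched with an in-memory map standing in for the database table. All names here are hypothetical, and treating a differing shard count as an error is an assumption of this sketch; the description above only covers the matching case.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
enum CreateOutcome {
    Created,
    /// Same tenant ID and shard count: treated as a duplicate request.
    AlreadyExists,
}

/// Assumed error for a repeated ID with a different shard count.
#[derive(Debug, PartialEq, Eq)]
struct ShardCountMismatch {
    existing: u8,
    requested: u8,
}

/// `store` maps tenant ID to shard count and stands in for the table
/// whose unique key would otherwise be violated.
fn create_tenant(
    store: &mut HashMap<String, u8>,
    tenant_id: &str,
    shard_count: u8,
) -> Result<CreateOutcome, ShardCountMismatch> {
    match store.get(tenant_id).copied() {
        None => {
            store.insert(tenant_id.to_string(), shard_count);
            Ok(CreateOutcome::Created)
        }
        Some(existing) if existing == shard_count => Ok(CreateOutcome::AlreadyExists),
        Some(existing) => Err(ShardCountMismatch {
            existing,
            requested: shard_count,
        }),
    }
}
```

Calling `create_tenant` twice with the same arguments succeeds both times, which is what makes a timed-out and retried creation safe.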

## Problem

#7227 destabilized various tests in the performance suite, with log errors during shutdown. It's because we switched shutdown order to stop the storage controller before the pageservers.

## Summary of changes

- Tolerate "connection failed" errors from pageservers trying to validate their deletion queue.

## Problem

This is a refactor.

This PR was a precursor to a much smaller change e5bd602, where as I was writing it I found that we were not far from getting rid of the last non-deprecated code paths that use `mgr::` scoped functions to get at the TenantManager state.

We're almost done cleaning this up as per #5796. The only significant remaining mgr:: item is `get_active_tenant_with_timeout`, which is page_service's path for fetching tenants.

## Summary of changes

- Remove the bool argument to get_attached_tenant_shard: this was almost always false from API use cases, and in cases when it was true, it was readily replaceable with an explicit check of the returned tenant's status.
- Rather than letting the timeline eviction task query any tenant it likes via `mgr::`, pass an `Arc<Tenant>` into the task. This is still an ugly circular reference, but should eventually go away: either when we switch to exclusively using disk usage eviction, or when we change metadata storage to avoid the need to imitate layer accesses.
- Convert all the mgr::get_tenant call sites to use TenantManager::get_attached_tenant_shard
- Move list_tenants into TenantManager.

## Problem

neondatabase/cloud#9642

## Summary of changes

1. Make `EndpointRateLimiter` generic, renamed as `BucketRateLimiter`
2. Add support for claiming multiple tokens at once
3. Add `AuthRateLimiter` alias.
4. Check `(Endpoint, IP)` pair during authentication, weighted by how many hashes proxy would be doing.

TODO: handle ipv6 subnets. will do this in a separate PR.
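A simplified sketch of a key-generic bucket limiter that can claim several tokens at once. It borrows the `BucketRateLimiter` and `AuthRateLimiter` names from the list above for readability, but the fields, the `check` signature, and the explicit `now_secs` clock are assumptions of this sketch and not the proxy's actual implementation.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::net::IpAddr;

struct Bucket {
    tokens: u32,
    last_refill_secs: u64,
}

/// Simplified token-bucket limiter, generic over the key type.
struct BucketRateLimiter<K> {
    capacity: u32,
    refill_per_sec: u32,
    buckets: HashMap<K, Bucket>,
}

impl<K: Hash + Eq> BucketRateLimiter<K> {
    fn new(capacity: u32, refill_per_sec: u32) -> Self {
        Self {
            capacity,
            refill_per_sec,
            buckets: HashMap::new(),
        }
    }

    /// Try to claim `n` tokens for `key` at time `now_secs`.
    /// Returns `false` and leaves the bucket untouched if fewer than `n` remain.
    fn check(&mut self, key: K, n: u32, now_secs: u64) -> bool {
        let capacity = self.capacity;
        let refill_per_sec = u64::from(self.refill_per_sec);
        let bucket = self.buckets.entry(key).or_insert(Bucket {
            tokens: capacity,
            last_refill_secs: now_secs,
        });

        // Refill according to elapsed time, capped at the bucket capacity.
        let elapsed = now_secs.saturating_sub(bucket.last_refill_secs);
        let refill = elapsed.saturating_mul(refill_per_sec).min(u64::from(capacity)) as u32;
        bucket.tokens = bucket.tokens.saturating_add(refill).min(capacity);
        bucket.last_refill_secs = bucket.last_refill_secs.max(now_secs);

        if bucket.tokens >= n {
            bucket.tokens -= n;
            true
        } else {
            false
        }
    }
}

/// Assumed shape of the alias: one bucket per (endpoint, IP) pair.
type AuthRateLimiter = BucketRateLimiter<(String, IpAddr)>;
```

In this sketch, an authentication attempt that would cost `n` hashes calls `check(key, n, now)`, so expensive attempts drain the bucket faster than cheap ones.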

…eceiver_connection tokio task (#7235)

# Problem

As pointed out through doc-comments in this PR, `drop_old_connection` is not cancellation-safe.

This means we can leave a `handle_walreceiver_connection` tokio task dangling during Timeline shutdown.

More details described in the corresponding issue #7062.

# Solution

Don't cancel-by-drop the `connection_manager_loop_step` from the `tokio::select!()` in the task_mgr task. Instead, transform the code to use a `CancellationToken` --- specifically, `task_mgr::shutdown_token()` --- and make code responsive to it.

The `drop_old_connection()` is still not cancellation-safe and also doesn't get a cancellation token, because there's no point inside the function where we could return early if cancellation were requested using a token.

We rely on the `handle_walreceiver_connection` to be sensitive to the `TaskHandle`'s cancellation token (argument name: `cancellation`). Currently it checks for `cancellation` on each WAL message. It is probably also sensitive to `Timeline::cancel` because ultimately all that `handle_walreceiver_connection` does is interact with the `Timeline`.

In summary, the above means that the following code (which is found in `Timeline::shutdown`) now might **take longer**, but actually ensures that all `handle_walreceiver_connection` tasks are finished:

```rust
task_mgr::shutdown_tasks(
    Some(TaskKind::WalReceiverManager),
    Some(self.tenant_shard_id),
    Some(self.timeline_id)
)
```

# Refs

refs #7062

## Problem

We don't want to run an excessive e2e test suite on neonvm if there are no relevant changes.

## Summary of changes

- Check the PR diff and, if there are no relevant compute changes (in `vendor/`, `pgxn/`, `libs/vm_monitor` or `Dockerfile.compute-node`), skip the e2e test suite on neonvm
- Switch job from `small` to `ubuntu-latest` runner to make it possible to use GitHub CLI

Reverts #7052

github-actions · 2024-03-27T19:03:01Z

2730 tests run: 2591 passed, 0 failed, 139 skipped (full report)

Code coverage* (full report)

functions: 28.2% (6307 of 22367 functions)
lines: 47.0% (44289 of 94303 lines)

* collected from Rust tests only

The comment gets automatically updated with the latest test results.
24c5a5a at 2024-03-27T19:03:01.180Z :recycle:

danieltprice · 2024-03-28T13:18:40Z

Reviewed for 04-29-2024 changelog. Nothing to add.

VladLazar and others added 11 commits March 26, 2024 14:35

test_runner/performance: skip test_bulk_insert (#7238)

3426619


Revert "Revoke REPLICATION" (#7261)

24c5a5a

Reverts #7052

vipvap requested review from a team as code owners March 27, 2024 18:16

vipvap requested review from khanova, VladLazar, NanoBjorn and conradludgate and removed request for a team March 27, 2024 18:16

skyzh changed the title ~~Release 2024-03-27~~ Release 2024-03-27 - compute only release Mar 27, 2024

skyzh approved these changes Mar 27, 2024


skyzh merged commit c431e2f into release Mar 27, 2024
104 checks passed

skyzh deleted the rc/2024-03-27 branch March 27, 2024 18:52


Release 2024-03-27 - compute only release #7263
