
control_plane: revise compute_hook locking (don't serialise all calls) #7088

Merged: 4 commits into main on Apr 6, 2024

Conversation

@jcsp (Collaborator) commented Mar 11, 2024

Problem

  • Previously, an async mutex was held for the duration of `ComputeHook::notify`. This served multiple purposes:
    • Ensure updates to a given tenant are sent in the proper order
    • Prevent concurrent calls into `neon_local` endpoint updates in test environments (`neon_local` is not safe to call concurrently)
    • Protect the inner `ComputeHook::state` hashmap that is used to calculate when to send notifications.

This worked, but had a major downside: while we were waiting for a compute hook request to the control plane to succeed, we could not notify about any other tenants. Notifications block progress of live migrations, so this was a problem.
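
For illustration, here is a minimal sketch of the old shape (assumed and simplified, not the actual code): a single async mutex guarding both the state map and the outbound request.

```rust
// Simplified sketch of the previous locking (an assumption, not the real
// code): one async mutex held across the whole notification, including IO.
use std::collections::HashMap;
use tokio::sync::Mutex as AsyncMutex;

struct TenantState; // placeholder for per-tenant notification state

struct OldComputeHook {
    // One lock guards everything: state updates AND the outbound request.
    state: AsyncMutex<HashMap<u64, TenantState>>,
}

impl OldComputeHook {
    async fn notify(&self, tenant_id: u64) {
        let mut state = self.state.lock().await;
        state.entry(tenant_id).or_insert(TenantState);
        // The lock is still held here, so a slow or retrying control plane
        // request blocks notifications for every other tenant.
        send_to_control_plane(tenant_id).await;
    }
}

async fn send_to_control_plane(_tenant_id: u64) { /* stand-in for the HTTP call */ }
```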

Summary of changes

  • Protect `ComputeHook::state` with a sync lock instead of an async lock (the overall layout is sketched just after this list).
  • Use a separate async lock (`ComputeHook::neon_local_lock`) to prevent concurrent calls into `neon_local`, and only take this lock in the `neon_local` code path.
  • Add per-tenant async locks in `ShardedComputeHookTenant`, and use these to ensure that only one remote notification per tenant can be in flight at a time. If several shards update concurrently, their updates will be coalesced.
  • Add an explicit semaphore that limits the concurrency of calls into the cloud control plane.
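
A rough structural sketch of that layout (field names such as `send_lock` and `api_concurrency` are assumptions; the real types are more involved):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex}; // std Mutex: its guard cannot be held across an .await in a Send task
use tokio::sync::{Mutex as AsyncMutex, Semaphore};

struct ShardedComputeHookTenant {
    // ...latest per-shard state, used to build the next notification...
    // Per-tenant async lock: at most one remote notification in flight per
    // tenant; concurrent shard updates coalesce into a single request.
    send_lock: Arc<AsyncMutex<()>>,
}

struct ComputeHook {
    // Sync lock protecting only the map; never held while waiting on IO.
    state: Mutex<HashMap<u64, ShardedComputeHookTenant>>, // keyed by tenant ID (placeholder key type)
    // Taken only on the neon_local code path, since neon_local is not
    // safe to call concurrently.
    neon_local_lock: AsyncMutex<()>,
    // Caps concurrent requests to the cloud control plane.
    api_concurrency: Semaphore,
}
```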

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat the commit message so it does not include the above checklist

@jcsp added the t/feature (Issue type: feature, for new features or requests) and c/storage/controller (Component: Storage Controller) labels on Mar 11, 2024
@jcsp requested a review from a team as a code owner on March 11, 2024 14:30
@jcsp requested a review from arpad-m on March 11, 2024 14:30

github-actions bot commented Mar 11, 2024

2802 tests run: 2678 passed, 0 failed, 124 skipped (full report)


Flaky tests (3)

Postgres 14

  • test_timeline_init_break_before_checkpoint: debug
  • test_empty_branch_remote_storage_upload: debug
  • test_branched_from_many_empty_parents_size: debug

Code coverage* (full report)

  • functions: 28.0% (6406 of 22857 functions)
  • lines: 46.9% (45098 of 96230 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
cf1b476 at 2024-04-06T20:07:07.354Z

@jcsp marked this pull request as draft on March 28, 2024 21:09
@jcsp force-pushed the jcsp/storcon-compute-hook-concurrency branch from b139bd9 to f049ea1 on March 28, 2024 21:10
@jcsp force-pushed the jcsp/storcon-compute-hook-concurrency branch from f049ea1 to 283dbbc on April 2, 2024 16:55
@jcsp marked this pull request as ready for review on April 2, 2024 16:55
@jcsp (Collaborator, Author) commented Apr 2, 2024

This didn't reduce down as neatly as I had hoped, but I think the logic is sound: we try to acquire a ComputeHookTenant's async lock before sending an update, and if we can't acquire it we drop out, wait for the lock, and try again. The request we send is always derived from latest state of the ComputeHookTenant, so there is no longer any sensitivity to which order notify() calls complete in.

I considered using an async lock for ComputeHook::state to make the control flow a bit less tortured, but on balance I prefer to use a sync lock so that the compiler enforces that we do not hold it while waiting for IO.
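
A hedged sketch of that acquire-or-wait flow (the function and lock names here are assumptions, not the actual code):

```rust
use std::sync::Arc;
use tokio::sync::Mutex as AsyncMutex;

// Sketch of the acquire-or-wait pattern: try the per-tenant send lock
// without blocking; if another task holds it, wait for it and then send
// a request rebuilt from the *latest* tenant state.
async fn notify_one_tenant(send_lock: Arc<AsyncMutex<()>>) {
    let _guard = match send_lock.try_lock() {
        Ok(guard) => guard,
        // Another notification is in flight; drop out and wait our turn.
        Err(_) => send_lock.lock().await,
    };
    // Build the payload from the current ComputeHookTenant state and send
    // it. Because the payload always reflects the latest state, the order
    // in which notify() calls complete no longer matters.
}

#[tokio::main]
async fn main() {
    notify_one_tenant(Arc::new(AsyncMutex::new(()))).await;
}
```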

@jcsp requested a review from problame on April 5, 2024 10:43
@problame (Contributor) left a comment

I think this code has no locking bugs, but it's all quite brittle / hard to follow.

My only strong request for changes is #7088 (comment).

The naming changes would be pretty nice, but if you don't have time for them, I can do them in a follow-up.

5 resolved review comments on control_plane/attachment_service/src/compute_hook.rs (outdated)
@jcsp (Collaborator, Author) commented Apr 5, 2024

Comments on names/structs are all quite reasonable, amended in cf1b476

@jcsp enabled auto-merge (squash) on April 5, 2024 17:00
@jcsp merged commit 74b2314 into main on Apr 6, 2024
53 checks passed
@jcsp deleted the jcsp/storcon-compute-hook-concurrency branch on April 6, 2024 19:52
@jcsp added a commit that referenced this pull request on Apr 29, 2024
…7495)

## Problem

Previously, we tried to send compute notifications in startup_reconcile
before completing that function, with a time limit. Any notifications
that didn't complete within the time limit left tenants with their
`pending_compute_notification` flag set, which causes them to spawn a
Reconciler next time the background reconciler loop runs.

This causes two problems:
- Spawning a lot of reconcilers after startup caused a spike in memory
(this is addressed in #7493)
- After #7493, spawning lots of
reconcilers will block some other operations, e.g. a tenant creation
might fail due to lack of reconciler semaphore units while the
controller is busy running all the Reconcilers for its startup compute
notifications.

When the code was first written, ComputeHook didn't have internal
ordering logic to ensure that notifications for a shard were sent in the
right order. Since that was added in
#7088, we can use it to avoid
waiting for notifications to complete in startup_reconcile.

Related to: #7460

## Summary of changes

- Add a `notify_background` method to ComputeHook.
- Call this from startup_reconcile instead of doing notifications inline
- Process completions from `notify_background` in `process_results`, and
if a notification failed then set the `pending_compute_notification`
flag on the shard.

The result is that we will only spawn lots of Reconcilers if the compute
notifications _fail_, not just because they take some significant amount
of time.
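
A rough sketch of that flow under assumed names and signatures (`NotifyError`, `send_notification`, and the channel-based plumbing are illustrative, not the actual API):

```rust
use tokio::sync::mpsc;

#[derive(Debug)]
struct NotifyError; // placeholder error type

type TenantShardId = u64; // placeholder for the real ID type

// Sketch of notify_background: instead of awaiting each notification
// inline during startup_reconcile, spawn them and report outcomes over a
// channel for process_results to consume later.
fn notify_background(
    shards: Vec<TenantShardId>,
) -> mpsc::UnboundedReceiver<(TenantShardId, Result<(), NotifyError>)> {
    let (tx, rx) = mpsc::unbounded_channel();
    for shard in shards {
        let tx = tx.clone();
        tokio::spawn(async move {
            let result = send_notification(shard).await;
            let _ = tx.send((shard, result)); // receiver may be gone; ignore
        });
    }
    rx
}

async fn send_notification(_shard: TenantShardId) -> Result<(), NotifyError> {
    Ok(()) // stand-in for the real compute hook request
}

// In process_results: only failed notifications set the pending flag, so
// Reconcilers are spawned only when notifications actually fail.
async fn process_results(
    mut rx: mpsc::UnboundedReceiver<(TenantShardId, Result<(), NotifyError>)>,
) {
    while let Some((_shard, result)) = rx.recv().await {
        if result.is_err() {
            // ...set pending_compute_notification on this shard...
        }
    }
}
```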

Test coverage for this case is in
#7475