Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(compute_ctl): race condition in configurator #9162

Merged
merged 2 commits into from
Sep 26, 2024

Conversation

petuhovskiy
Copy link
Member

@petuhovskiy petuhovskiy commented Sep 26, 2024

There was a tricky race condition in compute_ctl, that sometimes makes configurator skip updates. It makes a deadlock because:

  • control-plane cannot configure compute, because it's in ConfigurationPending state
  • compute_ctl doesn't do any reconfiguration because configurator_main_loop missed notification for it

Full sequence that reproduces the issue:

  1. start_compute finishes works and changes status self.set_status(ComputeStatus::Running);
  2. configurator received update about Running state and dropped the mutex lock in the iteration
  3. /configure request was triggered at the same time as step 1, and got the mutex lock
  4. same /configure request set the spec and updated the state to ConfigurationPending, also sent a notification
  5. next iteration in configurator got the mutex lock, but missed the notification

There are more details in this slack thread: https://neondb.slack.com/archives/C03438W3FLZ/p1727281028478689?thread_ts=1727261220.483799&cid=C03438W3FLZ

patch author: @ololobus

Copy link

github-actions bot commented Sep 26, 2024

5013 tests run: 4849 passed, 0 failed, 164 skipped (full report)


Flaky tests (5)

Postgres 17

Postgres 16

Postgres 15

Code coverage* (full report)

  • functions: 32.0% (7494 of 23396 functions)
  • lines: 50.0% (60473 of 120865 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
dbed75f at 2024-09-26T11:55:51.290Z :recycle:

Copy link
Member

@ololobus ololobus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a comment already written locally :) I think it's worth of elaborating a bit

compute_tools/src/configurator.rs Outdated Show resolved Hide resolved
Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>
@petuhovskiy petuhovskiy merged commit 80e974d into main Sep 26, 2024
83 checks passed
@petuhovskiy petuhovskiy deleted the fix-compute-configurator-rc branch September 26, 2024 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants