Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable early-out notify optimizations #139

Merged
merged 1 commit into from
May 28, 2024
Merged

Disable early-out notify optimizations #139

merged 1 commit into from
May 28, 2024

Conversation

notgull
Copy link
Member

@notgull notgull commented May 27, 2024

The notify() function contains two optimizations:

  • If the inner field is not yet initialized (i.e. no listeners have been
    created yet), it returns 0 from the function early.
  • If the atomic notified field indicates that the current notification would
    do nothing, it returns 0 early as well.

loom testing has exposed race conditions in these optimizations that can cause
notifications to be missed, leading to deadlocks. This commit removes these
optimizations in the hope of preventing these deadlocks. Ideally we can restore
them in the future.

Closes #138
cc #130 and #133

The notify() function contains two optimizations:

- If the `inner` field is not yet initialized (i.e. no listeners have been
  created yet), it returns `0` from the function early.
- If the atomic `notified` field indicates that the current notification would
  do nothing, it returns `0` early as well.

`loom` testing has exposed race conditions in these optimizations that can cause
notifications to be missed, leading to deadlocks. This commit removes these
optimizations in the hope of preventing these deadlocks. Ideally we can restore
them in the future.

Closes #138
cc #130 and #133

Signed-off-by: John Nunley <dev@notgull.net>
@notgull
Copy link
Member Author

notgull commented May 27, 2024

r? @zeenix, as this is the conclusion of the deep dive I've been doing for the past couple weeks, as mentioned here: #130 (comment)

r? @fogti, as per the C memory model I'm relatively sure these optimizations should have any effect on the order of events, but I'm not sure.

@notgull
Copy link
Member Author

notgull commented May 27, 2024

I've tested async-lock with a patched version of event-listener with this PR and the MIRI tests pass, even with the previously ignored tests un-ignored. In addition I'm relatively sure it passes Loom tests (smol-rs/async-lock#86). However I can't be 100% sure; my max-preemptions count is set very low as I do not have access to hardware that can conduct full loom tests at the moment.

I'm unclear why this bug is manifesting now; pretty much all versions of event-listener contain this code and v5 didn't really touch it. Maybe it just offset the timing just right for the race condition to start showing up in normal code.

Copy link
Member

@zeenix zeenix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work figuring this one! 👍 Wouldn't removing these optimizations mean that we'll wake up tasks unnecessarily? 🤔

@notgull
Copy link
Member Author

notgull commented May 27, 2024

No, these optimizations were there to add fast paths that didn't lock the inner mutex and traverse the linked list. This operation is somewhat expensive so this optimization avoided doing that if it could. Removing it won't wake up any additional tasks.

@notgull notgull requested a review from zeenix May 28, 2024 00:12
Copy link
Member

@zeenix zeenix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. LGTM then.

@notgull notgull merged commit 85e3474 into master May 28, 2024
10 checks passed
@notgull notgull deleted the notgull/no-opt branch May 28, 2024 23:01
@notgull notgull mentioned this pull request May 29, 2024
src/lib.rs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

During racy initialization, a race can happen where a notification is dropped
4 participants