Fix races around pthread exit and join #409

Merged
merged 2 commits into from
Jun 7, 2023
1 change: 1 addition & 0 deletions Makefile
@@ -325,6 +325,7 @@ ifeq ($(THREAD_MODEL), posix)
# Specify the tls-model until LLVM 15 is released (which should contain
# https://reviews.llvm.org/D130053).
CFLAGS += -mthread-model posix -pthread -ftls-model=local-exec
ASMFLAGS += -matomics

# Include cloudlib's directory to access the structure definition of clockid_t
CFLAGS += -I$(LIBC_BOTTOM_HALF_CLOUDLIBC_SRC)
22 changes: 8 additions & 14 deletions libc-top-half/musl/src/thread/pthread_create.c
@@ -164,14 +164,6 @@ static void __pthread_exit(void *result)
self->prev->next = self->next;
self->prev = self->next = self;

#ifndef __wasilibc_unmodified_upstream
/* On Linux, the thread is created with CLONE_CHILD_CLEARTID,
* and this lock will unlock by kernel when this thread terminates.
* So we should unlock it here in WebAssembly.
* See also set_tid_address(2) */
__tl_unlock();
#endif
abrown marked this conversation as resolved.

#ifdef __wasilibc_unmodified_upstream
if (state==DT_DETACHED && self->map_base) {
/* Detached threads must block even implementation-internal
@@ -190,9 +182,6 @@ static void __pthread_exit(void *result)
}
#else
if (state==DT_DETACHED && self->map_base) {
// __syscall(SYS_exit) would unlock the thread, list
// do it manually here
__tl_unlock();
free(self->map_base);
// Can't use `exit()` here, because it is too high level
return;
@@ -212,10 +201,15 @@ static void __pthread_exit(void *result)
#ifdef __wasilibc_unmodified_upstream
for (;;) __syscall(SYS_exit, 0);
#else
// __syscall(SYS_exit) would unlock the thread, list
// do it manually here
__tl_unlock();
// Can't use `exit()` here, because it is too high level

/* On Linux, the thread is created with CLONE_CHILD_CLEARTID,
* and the lock (__thread_list_lock) will be unlocked by kernel when
* this thread terminates.
* See also set_tid_address(2)
*
* In WebAssembly, we leave it to wasi_thread_start instead.
*/
Member

Could we instead call __tl_unlock right before a_store(&self->detach_state, DT_EXITED); like we do right before free(self->map_base);?

In the first case we are unlocking right before notifying the joiner that they can call free, and in the latter case we are unlocking right before we call free ourselves, so it seems symmetrical.

(I'm suggesting this so that we can avoid re-implementing __tl_unlock in asm if we can).

Member

Actually, what was wrong with the single __tl_unlock on line 172 that we had previously?

Contributor Author

> Could we instead call __tl_unlock right before a_store(&self->detach_state, DT_EXITED); like we do right before free(self->map_base);?
>
> In the first case we are unlocking right before notifying the joiner that they can call free, and in the latter case we are unlocking right before we call free ourselves, so it seems symmetrical.

  • the detached case is broken, as alex said.
  • detached threads never have joiners.

> (I'm suggesting this so that we can avoid re-implementing __tl_unlock in asm if we can).

this PR doesn't re-implement __tl_unlock.
it emulates CLONE_CHILD_CLEARTID, which the __tl_lock/unlock/sync protocol relies on.
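
For readers unfamiliar with CLONE_CHILD_CLEARTID: on Linux it asks the kernel to zero a registered word when the thread exits and to wake one futex waiter on that word. A minimal sketch of that behavior in portable C11 follows. The names `lock_word`, `wake_calls`, `model_wake` and `model_clear_child_tid` are hypothetical and exist only for illustration; this is a model of the idea, not the code in this PR, which performs the equivalent steps outside of C.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for __thread_list_lock:
 * 0 means unlocked, nonzero holds the owner's tid. */
static atomic_int lock_word;

/* Counts wake-ups so the sketch can be checked without a real futex. */
static int wake_calls;

/* Hypothetical wake primitive. A real one would be a futex wake on
 * Linux or memory.atomic.notify on WebAssembly. */
static void model_wake(atomic_int *addr, int max_waiters)
{
    (void)addr;
    assert(max_waiters >= 0);
    wake_calls++;
}

/* Models what CLONE_CHILD_CLEARTID does at thread exit:
 * store 0 to the registered word, then wake one waiter
 * unconditionally. */
static void model_clear_child_tid(atomic_int *addr)
{
    atomic_store(addr, 0);
    model_wake(addr, 1);
}
```

The wake is unconditional in this model; checking a waiter count first would be an optimization, not a correctness requirement.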

Contributor Author

> Actually, what was wrong with the single __tl_unlock on line 172 that we had previously?

  • double unlock.
  • it unlocks too early and allows the joiner to free our stack while we are still running on it.

Member

@sbc100 Apr 15, 2023

I agree the double unlock thing looks like a bug.

But the joiner is not waiting on the tl_lock, is it? The joiner seems to be waiting on t->detach_state. In fact, I don't see the tl_lock referenced at all in pthread_join.c. Maybe I'm missing something?

Contributor Author

the joiner uses __tl_sync to sync with the exit.
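
The two-step wait discussed in this thread can be sketched as a small, non-blocking C model: the joiner first observes the exit state, then additionally requires the thread list lock to be released before it frees the exiting thread's stack. All names here (`model_state`, `model_thread`, `model_list_lock`, `model_joiner_may_free`) are hypothetical, the state enum is simplified, and the real joiner blocks at each step rather than returning false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified states; the real enum has more values. */
enum model_state { MODEL_EXITING, MODEL_EXITED };

struct model_thread {
    atomic_int detach_state; /* what the joiner waits on first */
};

/* Hypothetical stand-in for __thread_list_lock: nonzero while held. */
static atomic_int model_list_lock;

/* Non-blocking model of the joiner's two checks. */
static bool model_joiner_may_free(struct model_thread *t)
{
    assert(t);
    if (atomic_load(&t->detach_state) != MODEL_EXITED)
        return false; /* step 1: the thread has not announced its exit */
    if (atomic_load(&model_list_lock) != 0)
        return false; /* step 2: the exiting thread still holds the lock */
    return true;      /* only now is freeing the stack safe */
}
```

In this model, releasing the lock before the exiting thread has finished using its stack would let step 2 pass too early, which is the race described above.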

#endif
}

16 changes: 16 additions & 0 deletions libc-top-half/musl/src/thread/wasm32/wasi_thread_start.s
@@ -28,4 +28,20 @@ wasi_thread_start:
local.get 1 # start_arg
call __wasi_thread_start_C

# Unlock thread list. (as CLONE_CHILD_CLEARTID would do for Linux)
#
# Note: once we unlock the thread list, our "map_base" can be freed
# by a joining thread. It's safe as we are in ASM and no longer use
# our C stack or pthread_t.
i32.const __thread_list_lock
i32.const 0
i32.atomic.store 0
# As an optimization, we can check tl_lock_waiters here.
# But for now, simply wake up unconditionally as
# CLONE_CHILD_CLEARTID does.
i32.const __thread_list_lock
i32.const 1
memory.atomic.notify 0
drop
Member

How about putting all this new code in a new local function called __tl_unlock_asm? And perhaps mention in the comment why we need the asm re-implementation here.

Otherwise this lgtm now.

Contributor Author

i added a comment.


Collaborator

Reading over the definition of __tl_unlock, it additionally decrements tl_lock_count. It might be good to assert in C that the count is zero before returning here to perform this unlock. Additionally, it might be good to leave a comment in __tl_unlock that this code needs updating if the implementation changes.

Member

Why not just call __tl_unlock?

Contributor Author

> Reading over the definition of __tl_unlock, it additionally decrements tl_lock_count. It might be good to assert in C that the count is zero before returning here to perform this unlock. Additionally, it might be good to leave a comment in __tl_unlock that this code needs updating if the implementation changes.

i feel it belongs in upstream musl, not here.

Contributor Author

> Why not just call __tl_unlock?

because __tl_unlock is a C function, which potentially requires a C stack.

Collaborator

I'm not sure what you mean by it belongs in upstream musl? Wasi-libc is already patching musl a fair bit, so what I'm saying is that as part of the patching the cases where __pthread_exit returns should all assert that the tl_lock_count is zero.

Contributor Author

> I'm not sure what you mean by it belongs in upstream musl? Wasi-libc is already patching musl a fair bit, so what I'm saying is that as part of the patching the cases where __pthread_exit returns should all assert that the tl_lock_count is zero.

i meant that the requirement that tl_lock_count be zero when a thread exits is not wasi-specific.
if you feel strongly, i can add such an assertion.
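
The assertion under discussion can be illustrated with a single-threaded C model, assuming the lock is recursive for its owner and tracks nesting in a counter, as the comments about tl_lock_count suggest. The names `model_lock`, `model_lock_count`, `model_tl_lock`, `model_tl_unlock` and `model_raw_release` are hypothetical, and contention handling is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int model_lock;  /* hypothetical stand-in for __thread_list_lock */
static int model_lock_count;   /* hypothetical stand-in for tl_lock_count */

/* Recursive acquire by the same owner. Single-threaded model:
 * no waiting or compare-and-swap loop is shown. */
static void model_tl_lock(int tid)
{
    if (atomic_load(&model_lock) == tid) {
        model_lock_count++; /* nested acquisition */
        return;
    }
    atomic_store(&model_lock, tid);
}

/* Releases one level of nesting; clears the word only at the
 * outermost level. */
static void model_tl_unlock(void)
{
    if (model_lock_count) {
        model_lock_count--;
        return;
    }
    atomic_store(&model_lock, 0);
}

/* Raw release, as a bare atomic store would do. It skips the counter,
 * so it is only correct when no nested acquisitions remain. */
static void model_raw_release(void)
{
    assert(model_lock_count == 0);
    atomic_store(&model_lock, 0);
}
```

A raw store of zero ignores the nesting counter, which is why an assertion of this kind would have to run in C before control leaves the C code.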

end_function