-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MONO][Marshal] Fix race condition in marshal callback installation #77383
Conversation
/azp run runtime-extra-platforms |
/azp run runtime-wasm |
Azure Pipelines successfully started running 1 pipeline(s). |
1 similar comment
Azure Pipelines successfully started running 1 pipeline(s). |
|
||
void | ||
mono_install_marshal_callbacks_ilgen (MonoMarshalIlgenCallbacks *cb) | ||
{ | ||
g_assert (!ilgen_cb_inited); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the race occur ? This should be called during startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vargaz @naricc since e107bcb, mono_marshal_ilgen_init
may be called by embedders early, but it doesn't have to be called. In the case where the driver doesn't call it early (for example the normal desktop mono driver doesn't call it), get_marshal_cb
will initialize lazily. The race is in the lazy case where one thread can see ilgen_cb_inited == TRUE
, but the callbacks are all still null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it get called during runtime startup ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't it get called during runtime startup ?
I was surprised, but apparently it doesn't. Going back to the very first PR it's always been lazy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now on wasm at least, all such initialization needs to happen before the runtime is initialized, i.e.
https://github.com/dotnet/runtime/blob/main/src/mono/wasm/runtime/driver.c#L593
When mono_jit_init_version () is called, it can check whenever the embedder has made some custom changes, and if not, initialize things the default way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I don't have any objection to just using mono_marshal_ilgen_init
to set a flag and doing all the callback initialization late during startup. @naricc I think that's what you wanted to do for the component in the first place right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lambdageek That is what I am doing for the componentization change.
So is that what we should do here too, instead of the thing with atomics? mono_marshal_ilgen_init
sets a flag; we check that flag during, say, mini_init()
. At which point we do callback installation if it is set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in the case that we are using NOILGEN and no one has called mono_marshal_ilgen_init
we just want to go ahead and install the noilgen callbacks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea that sounds right.
So is that what we should do here too, instead of the thing with atomics? mono_marshal_ilgen_init sets a flag; we check that flag during, say, mini_init(). At which point we do callback installation if it is set?
Yea, basically at the point where in the componentized versino we'd call the component init function, here we'd just set up the callbacks.
I think writing down #77383 (comment) helped me to think about what the real problem is. It's not concurrency. It's that we have a confusing API for choosing between ilgen/noilgen. We should init the marshaling stack at startup - there's no need for it to be lazy. You were right in your proposal about mono_marshal_ilgen_init
(just use it to set a flag), but I didn't understand the problem properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. needs barriers. also some style nits
memcpy (&ilgen_marshal_cb, cb, sizeof (MonoMarshalIlgenCallbacks)); | ||
ilgen_cb_inited = TRUE; | ||
} | ||
MonoMarshalIlgenCallbacks* local_cb = (MonoMarshalIlgenCallbacks*)malloc(sizeof(MonoMarshalIlgenCallbacks)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use g_malloc
/g_free
|
||
if (mono_atomic_cas_ptr((void**)&ilgen_marshal_cb, local_cb, NULL )) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (mono_atomic_cas_ptr((void**)&ilgen_marshal_cb, local_cb, NULL )) | |
if (mono_atomic_cas_ptr((void**)&ilgen_marshal_cb, local_cb, NULL ) != NULL) |
nit: usually the pattern is if (mono_atomic_cas_ptr (dest, newValue, expectedValue) != expectedValue)
ilgen_cb_inited = TRUE; | ||
} | ||
MonoMarshalIlgenCallbacks* local_cb = (MonoMarshalIlgenCallbacks*)malloc(sizeof(MonoMarshalIlgenCallbacks)); | ||
memcpy (local_cb, cb, sizeof (MonoMarshalIlgenCallbacks)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a memory barrier between the memcpy and the CAS
memcpy (local_cb, cb, sizeof (MonoMarshalIlgenCallbacks)); | |
memcpy (local_cb, cb, sizeof (MonoMarshalIlgenCallbacks)); | |
mono_memory_barrier (); |
src/mono/mono/metadata/marshal.c
Outdated
MonoMarshalLightweightCallbacks* local_cb = (MonoMarshalLightweightCallbacks*)malloc(sizeof(MonoMarshalLightweightCallbacks)); | ||
memcpy (local_cb, cb, sizeof (MonoMarshalLightweightCallbacks)); | ||
|
||
if (mono_atomic_cas_ptr((void**)&marshal_lightweight_cb, local_cb, NULL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comments as the other method:
- use
g_malloc
/g_free
- barrier after memcpy
- add
!= NULL
to theif
condition
Please use a better summary. something like |
I guess we're going with a flag instead of atomics
I am abandoing this PR in favor of doing it the flag way ( see discussion above). That is a different change, so I will do it in a different PR. |
This fixes a race condition on
ilgen_cb_inited
; this flag was used to determine if callbacks were already installed, but had no synchronization around it. Replaced with a pointer to a dynamically allocated buffer and acas
.Fixes: #74603
Fixes: #77090