Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition on marshal-ilgen init #74603

Closed
naricc opened this issue Aug 25, 2022 · 8 comments · Fixed by #77448
Closed

Race condition on marshal-ilgen init #74603

naricc opened this issue Aug 25, 2022 · 8 comments · Fixed by #77448
Assignees
Milestone

Comments

@naricc
Copy link
Member

naricc commented Aug 25, 2022

Description

Because of this change: #71203, mono_marshal_ilgen_init gets called later, after comnponents are initilized. This allows for potential races in initilizing the callback table.

I propose to fix this by making a pointer to ilgen_marshal_cb, and initing that pointer atomically (with a CAS) after the table has been constructed.

See comment here: #71203 (comment)

Reproduction Steps

Race condition; not easily reproducible.

Expected behavior

Runs

Actual behavior

Assertion at /home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c:63, condition `!ilgen_cb_inited' not met

with this call stack:

  #14 0x000003ffb327d0ac in mono_assertion_message (file=0x3ffb33865a0 "/home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c", line=63, condition=0x3ffb33865e0 "!ilgen_cb_inited") at /home/uweigand/runtime/src/mono/mono/eglib/goutput.c:226
  #15 0x000003ffb3321a30 in mono_install_marshal_callbacks_ilgen (cb=0x3ff9a3f16d0) at /home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c:63
  #16 0x000003ffb33221ce in mono_marshal_ilgen_init () at /home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c:2858
  #17 0x000003ffb3322106 in get_marshal_cb () at /home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c:2753
  #18 0x000003ffb3321f24 in mono_emit_marshal_ilgen (m=0x3ff9a3f1bc8, argnum=0, t=0x3ff8ad98b68, spec=0x0, conv_arg=0, conv_arg_type=0x3ff8ad99ed8, action=MARSHAL_ACTION_CONV_IN, lightweigth_cb=0x3ffb33910e8 <marshal_lightweight_cb>) at /home/uweigand/runtime/src/mono/mono/component/marshal-ilgen.c:2810
  #19 0x000003ffb2e54c52 in mono_emit_marshal (m=0x3ff9a3f1bc8, argnum=0, t=0x3ff8ad98b68, spec=0x0, conv_arg=0, conv_arg_type=0x3ff8ad99ed8, action=MARSHAL_ACTION_CONV_IN) at /home/uweigand/runtime/src/mono/mono/metadata/marshal.c:3226
  #20 0x000003ffb2f219aa in emit_native_wrapper_ilgen (image=0x2aa18a8b950, mb=0x3ff7c90eb50, sig=0x3ff8ad99cc8, piinfo=0x3ff8ad98af0, mspecs=0x3ff7c8f47e0, func=0x3ff82d81750 <CompressionNative_DeflateInit2_>, flags=(EMIT_NATIVE_WRAPPER_CHECK_EXCEPTIONS | EMIT_NATIVE_WRAPPER_RUNTIME_MARSHALLING_ENABLED)) at /home/uweigand/runtime/src/mono/mono/metadata/marshal-lightweight.c:937
  #21 0x000003ffb2e5755a in mono_marshal_emit_native_wrapper (image=0x2aa18a8b950, mb=0x3ff7c90eb50, sig=0x3ff8ad99cc8, piinfo=0x3ff8ad98af0, mspecs=0x3ff7c8f47e0, func=0x3ff82d81750 <CompressionNative_DeflateInit2_>, flags=(EMIT_NATIVE_WRAPPER_CHECK_EXCEPTIONS | EMIT_NATIVE_WRAPPER_RUNTIME_MARSHALLING_ENABLED)) at /home/uweigand/runtime/src/mono/mono/metadata/marshal.c:6300
  #22 0x000003ffb2e56b6c in mono_marshal_get_native_wrapper (method=0x3ff8ad98af0, check_exceptions=1, aot=0) at /home/uweigand/runtime/src/mono/mono/metadata/marshal.c:3672
  #23 0x000003ffb30328e8 in compile_special (method=0x3ff8ad98af0, error=0x3ff9a3f2c38) at /home/uweigand/runtime/src/mono/mono/mini/mini-runtime.c:2469
  #24 0x000003ffb30322b0 in mono_jit_compile_method_with_opt (method=0x3ff8ad98af0, opt=374417919, jit_only=0, error=0x3ff9a3f2c38) at /home/uweigand/runtime/src/mono/mono/mini/mini-runtime.c:2672
  #25 0x000003ffb3031c04 in jit_compile_method_with_opt_cb (arg=0x3ff9a3f28e8) at /home/uweigand/runtime/src/mono/mono/mini/mini-runtime.c:2762
  #26 0x000003ffb3029ea8 in jit_compile_method_with_opt (params=0x3ff9a3f28e8) at /home/uweigand/runtime/src/mono/mono/mini/mini-runtime.c:2778
  #27 0x000003ffb3029c7e in mono_jit_compile_method (method=0x3ff8ad98af0, error=0x3ff9a3f2c38) at /home/uweigand/runtime/src/mono/mono/mini/mini-runtime.c:2797
  #28 0x000003ffb31677ba in common_call_trampoline (regs=0x3ff9a3f2db8, code=0x3ffa806234c "\a\340\300\001", m=0x3ff8ad98af0, vt=0x0, vtable_slot=0x0, error=0x3ff9a3f2c38) at /home/uweigand/runtime/src/mono/mono/mini/mini-trampolines.c:618
  #29 0x000003ffb3166b8e in mono_magic_trampoline (regs=0x3ff9a3f2db8, code=0x3ffa806234c "\a\340\300\001", arg=0x3ff8ad98af0, tramp=0x3ffb3de02c8 "\300X") at /home/uweigand/runtime/src/mono/mono/mini/mini-trampolines.c:759

Regression?

No response

Known Workarounds

No response

Configuration

ppc, but probably possible on other architectures

Other information

No response

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 25, 2022
@naricc naricc added this to the 7.0.0 milestone Aug 25, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 25, 2022
@naricc naricc added blocking-release untriaged New issue has not been triaged by the area owner and removed untriaged New issue has not been triaged by the area owner labels Aug 25, 2022
@naricc
Copy link
Member Author

naricc commented Aug 25, 2022

@uweigand I've created this issues based on your comment from the PR. Thanks for finding that.

Is this blocking you? If it is an immediate blocker I will jump on this right now, otherwise I will wait until early next week.

@naricc naricc self-assigned this Aug 25, 2022
@uweigand
Copy link
Contributor

No, it's not a blocker - I just saw this on one testsuite run, and haven't been able to reproduce since.

Thanks for looking into this!

@SamMonoRT
Copy link
Member

cc @lambdageek fyi

@SamMonoRT
Copy link
Member

reverted the original change from 7.0. Moving this to 8.0

@lambdageek
Copy link
Member

#77090 seems like the same issue - the race is there even before the componentization work. It's worth fixing and backporting.

@SamMonoRT
Copy link
Member

@naricc - we should add the fix for this in your existing PR for completeness

@naricc
Copy link
Member Author

naricc commented Oct 21, 2022

Working on taking the fix from the marshal-component change and applying it independent of that change, so we can easily backport it to 7.

@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Oct 24, 2022
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Oct 25, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 31, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 30, 2022
@jandupej
Copy link
Member

Backporting to 7.0 in #83813

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.