Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mjit] Initialize MonoVTable and MonoClass one level up #28

Merged
merged 2 commits into from
Jul 30, 2018

Conversation

luhenry
Copy link
Collaborator

@luhenry luhenry commented Jul 27, 2018

No description provided.

@luhenry luhenry requested a review from kumpera as a code owner July 27, 2018 16:44
if (code_length)
*code_length = cfg->code_len;

mono_destroy_compile (cfg);

g_assert (code);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

MonoObject *compiler, *method_info, *ret;
MonoNativeCodeHandle native_code;
gpointer params[4];
gboolean bootstrapping;

bootstrapping = is_mjit_compiling;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. is bootstrapping ever true?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when you are compiling the ICompiler.CompileMethod for example.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, it's still not obvious to me what happens here:

  • first call of mjit_compile_method_inner, let's say for method Foo () -> is_mjit_compiling = FALSE
  • bootstrapping = is_mjit_compiling = FALSE
  • since !bootstrapping, we set is_mjit_compiling = TRUE
  • bootstrapping is still FALSE, so we are using mono_runtime_invoke_checked ()
  • that means we end up in mjit_compile_method_inner again, this time with bootstrapping = is_mjit_compiling = TRUE, and therefore return a "fake compiled entry" of MethodInfo.ctor()/ICompiler.CompileMethod(). By "fake compiled entry" I mean a function pointer that calls into the interpreter again, and executes said method.
  • we are back to the first call of mjit_compile_method_inner and compile Foo() with the managed JIT (the managed JIT runs in the interpreter).
  • since !bootstrapping is still the case we unsert is_mjit_compiling when we are done for Foo().

So either:

  1. we anyway run the managed JIT always in the interpreter, then I don't get the point of this change, as it seems to be a complicated flow of things a lot with no change.
  2. Or, I'm missing something. Please explain 🙂

Copy link
Collaborator Author

@luhenry luhenry Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lewurm

  • first call of mjit_compile_method_inner, let's say for method Foo () -> is_mjit_compiling = FALSE
  • bootstrapping = is_mjit_compiling = FALSE
  • since !bootstrapping, we set is_mjit_compiling = TRUE
  • bootstrapping is still FALSE, so we are using mono_runtime_invoke_checked () to invoke MiniCompiler.CompileMethod(method = "Foo")
    • if we end up in mjit_compile_method_inner again, it means we are compiling MiniCompiler.CompileMethod (with MiniCompiler => bootstrapping) -> we invoke with the Interpreter so we don't end up in mjit_compile_method_inner again.
  • we are back to the first call of mjit_compile_method_inner and compile Foo() with the managed JIT which runs as JITted code because we compiled it while running in the Interpreter.
  • since !bootstrapping is still the case we unsert is_mjit_compiling when we are done for Foo().

Where this method needs improvement is in the case where we are trying to compile 1 of the method called in the call chain of MiniCompiler.CompileMethod, we will run the whole MiniCompiler.CompileMethod for that method in the Interpreter, even if we have some pieces already JITted. That's where your note on tiered compilation comes into play and it would make it work better, but that's not in the spectrum right now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, missed that one bit. Thanks for being patient with me and taking time to explain it 🙂 very cool 👍

@luhenry luhenry merged commit c9d100a into lambdageek:mjit Jul 30, 2018
lambdageek pushed a commit that referenced this pull request Jun 25, 2019
We recently switched the AOT compiler for iOS armv7 from 32bit host to 64bit host. At the same time we _also_ switched from LLVM 3.6 to the newer LLVM 6.0 fork we support.

In a Xamarin.iOS Release build for iOS armv7 exception handling would crash. Specifically the problem was that the exception object wasn't properly injected into the handler block. The way it works is, that the exception handler machinery passes the pointer to the exception object via `r0`.

With the newer LLVM the generated code of the exception handling block
looks like this:
```
LBB1_15:                                @ %EH_CLAUSE0_BB3
Ltmp8:
	ldr	r0, [sp, #24]
	str	r0, [sp, #4]
	ldr	r0, [sp, #28]
	ldr	r0, [sp, #12]
	ldr	r0, [r0]
	cmp	r0, #0
	beq	LBB1_17
```
thus overwriting `r0`, so that the pointer to the exception object is lost.  With the older LLVM 3.6 the generated code looks like this:
```
LBB1_8:                                 @ %EH_CLAUSE0_BB3
Ltmp8:
	str	r0, [sp, #4]
	ldr	r0, [r6]
	cmp	r0, #0
	beq	LBB1_10
```
correctly storing the exception object into a stack slot for later usage.

After some time I figured out that there are _different_ exception handling models. Depending on the model, the exception pointer is passed in `r0` or not:
https://github.com/mono/llvm/blob/2bd2f1db1803f7b36687e5abf912c69baa848305/lib/Target/ARM/ARMISelLowering.cpp#L14416-L14428

`SjLj` stands for "SetJump / LongJump" and  means that exception handling is implemented with this. On iOS this is the default model that is used. We want `dwarf` instead, so we tell this `llc` now and the generated code handles it correctly.

So why was it working before? In our older LLVM 3.6 fork we hardcoded it:
https://github.com/mono/llvm/blob/f80899cb3eb75f7f5640b4519e83bd96991bffb8/lib/Target/ARM/ARMISelLowering.cpp#L756-L762

The `-exception-model=` option is also not available in LLVM 3.6.

Contributes to mono#15058 and mono#9621
lambdageek pushed a commit that referenced this pull request Aug 19, 2019
We recently switched the AOT compiler for iOS armv7 from 32bit host to 64bit host. At the same time we _also_ switched from LLVM 3.6 to the newer LLVM 6.0 fork we support.

In a Xamarin.iOS Release build for iOS armv7 exception handling would crash. Specifically the problem was that the exception object wasn't properly injected into the handler block. The way it works is, that the exception handler machinery passes the pointer to the exception object via `r0`.

With the newer LLVM the generated code of the exception handling block
looks like this:
```
LBB1_15:                                @ %EH_CLAUSE0_BB3
Ltmp8:
	ldr	r0, [sp, #24]
	str	r0, [sp, #4]
	ldr	r0, [sp, #28]
	ldr	r0, [sp, #12]
	ldr	r0, [r0]
	cmp	r0, #0
	beq	LBB1_17
```
thus overwriting `r0`, so that the pointer to the exception object is lost.  With the older LLVM 3.6 the generated code looks like this:
```
LBB1_8:                                 @ %EH_CLAUSE0_BB3
Ltmp8:
	str	r0, [sp, #4]
	ldr	r0, [r6]
	cmp	r0, #0
	beq	LBB1_10
```
correctly storing the exception object into a stack slot for later usage.

After some time I figured out that there are _different_ exception handling models. Depending on the model, the exception pointer is passed in `r0` or not:
https://github.com/mono/llvm/blob/2bd2f1db1803f7b36687e5abf912c69baa848305/lib/Target/ARM/ARMISelLowering.cpp#L14416-L14428

`SjLj` stands for "SetJump / LongJump" and  means that exception handling is implemented with this. On iOS this is the default model that is used. We want `dwarf` instead, so we tell this `llc` now and the generated code handles it correctly.

So why was it working before? In our older LLVM 3.6 fork we hardcoded it:
https://github.com/mono/llvm/blob/f80899cb3eb75f7f5640b4519e83bd96991bffb8/lib/Target/ARM/ARMISelLowering.cpp#L756-L762

The `-exception-model=` option is also not available in LLVM 3.6.

Contributes to mono#15058 and mono#9621
lambdageek added a commit that referenced this pull request Sep 26, 2019
)

* [threads] clear small_id_key TLS when unregistering a thread

Fixes
```
* thread #12, name = 'tid_a507', stop reason = EXC_BREAKPOINT (code=1, subcode=0x1be66144)
  * frame #0: 0x1be66144 libsystem_c.dylib`__abort + 184
    frame #1: 0x1be6608c libsystem_c.dylib`abort + 152
    frame #2: 0x003e1fa0 monotouchtest`log_callback(log_domain=0x00000000, log_level="error", message="* Assertion at ../../../../../mono/utils/hazard-pointer.c:158, condition `mono_bitset_test_fast (small_id_table, id)' not met\n", fatal=4, user_data=0x00000000) at runtime.m:1251:3
    frame #3: 0x003abf44 monotouchtest`monoeg_g_logv_nofree(log_domain=0x00000000, log_level=G_LOG_LEVEL_ERROR, format=<unavailable>, args=<unavailable>) at goutput.c:149:2 [opt]
    frame #4: 0x003abfb4 monotouchtest`monoeg_assertion_message(format=<unavailable>) at goutput.c:184:22 [opt]
    frame #5: 0x003904dc monotouchtest`mono_thread_small_id_free(id=<unavailable>) at hazard-pointer.c:0:2 [opt]
    frame #6: 0x003a0a74 monotouchtest`unregister_thread(arg=0x15c88400) at mono-threads.c:588:2 [opt]
    frame #7: 0x00336110 monotouchtest`mono_thread_detach_if_exiting at threads.c:1571:4 [opt]
    frame #8: 0x003e7a14 monotouchtest`::xamarin_release_trampoline(self=0x166452f0, sel="release") at trampolines.m:644:3
    frame #9: 0x001cdc40 monotouchtest`::-[__Xamarin_NSTimerActionDispatcher release](self=0x166452f0, _cmd="release") at registrar.m:83445:3
    frame #10: 0x1ce2ae68 Foundation`_timerRelease + 80
    frame #11: 0x1c31b56c CoreFoundation`CFRunLoopTimerInvalidate + 612
    frame #12: 0x1c31a554 CoreFoundation`__CFRunLoopTimerDeallocate + 32
    frame #13: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #14: 0x1c2be6e8 CoreFoundation`__CFArrayReleaseValues + 500
    frame #15: 0x1c2be4c4 CoreFoundation`CFArrayRemoveAllValues + 104
    frame #16: 0x1c31ff64 CoreFoundation`__CFSetApplyFunction_block_invoke + 24
    frame #17: 0x1c3b2e18 CoreFoundation`CFBasicHashApply + 116
    frame #18: 0x1c31ff10 CoreFoundation`CFSetApplyFunction + 160
    frame #19: 0x1c3152cc CoreFoundation`__CFRunLoopDeallocate + 204
    frame #20: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #21: 0x1c304a80 CoreFoundation`__CFTSDFinalize + 144
    frame #22: 0x1bfa324c libsystem_pthread.dylib`_pthread_tsd_cleanup + 644
    frame #23: 0x1bf9cc08 libsystem_pthread.dylib`_pthread_exit + 80
    frame #24: 0x1bf9af24 libsystem_pthread.dylib`pthread_exit + 36
    frame #25: 0x0039df84 monotouchtest`mono_threads_platform_exit(exit_code=<unavailable>) at mono-threads-posix.c:145:2 [opt]
    frame #26: 0x0033bb84 monotouchtest`start_wrapper(data=0x1609e1c0) at threads.c:1296:2 [opt]
    frame #27: 0x1bf9b914 libsystem_pthread.dylib`_pthread_body + 128
    frame #28: 0x1bf9b874 libsystem_pthread.dylib`_pthread_start + 44
    frame #29: 0x1bfa3b94 libsystem_pthread.dylib`thread_start + 4
```

* Update mono/utils/mono-threads.c

Co-Authored-By: Aleksey Kliger (λgeek) <akliger@gmail.com>
lambdageek pushed a commit that referenced this pull request Oct 11, 2019
mono#17015)

[2019-08] [threads] clear small_id_key TLS when unregistering a thread

Fixes
```
* thread #12, name = 'tid_a507', stop reason = EXC_BREAKPOINT (code=1, subcode=0x1be66144)
  * frame #0: 0x1be66144 libsystem_c.dylib`__abort + 184
    frame #1: 0x1be6608c libsystem_c.dylib`abort + 152
    frame #2: 0x003e1fa0 monotouchtest`log_callback(log_domain=0x00000000, log_level="error", message="* Assertion at ../../../../../mono/utils/hazard-pointer.c:158, condition `mono_bitset_test_fast (small_id_table, id)' not met\n", fatal=4, user_data=0x00000000) at runtime.m:1251:3
    frame #3: 0x003abf44 monotouchtest`monoeg_g_logv_nofree(log_domain=0x00000000, log_level=G_LOG_LEVEL_ERROR, format=<unavailable>, args=<unavailable>) at goutput.c:149:2 [opt]
    frame #4: 0x003abfb4 monotouchtest`monoeg_assertion_message(format=<unavailable>) at goutput.c:184:22 [opt]
    frame #5: 0x003904dc monotouchtest`mono_thread_small_id_free(id=<unavailable>) at hazard-pointer.c:0:2 [opt]
    frame #6: 0x003a0a74 monotouchtest`unregister_thread(arg=0x15c88400) at mono-threads.c:588:2 [opt]
    frame #7: 0x00336110 monotouchtest`mono_thread_detach_if_exiting at threads.c:1571:4 [opt]
    frame #8: 0x003e7a14 monotouchtest`::xamarin_release_trampoline(self=0x166452f0, sel="release") at trampolines.m:644:3
    frame #9: 0x001cdc40 monotouchtest`::-[__Xamarin_NSTimerActionDispatcher release](self=0x166452f0, _cmd="release") at registrar.m:83445:3
    frame #10: 0x1ce2ae68 Foundation`_timerRelease + 80
    frame #11: 0x1c31b56c CoreFoundation`CFRunLoopTimerInvalidate + 612
    frame #12: 0x1c31a554 CoreFoundation`__CFRunLoopTimerDeallocate + 32
    frame #13: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #14: 0x1c2be6e8 CoreFoundation`__CFArrayReleaseValues + 500
    frame #15: 0x1c2be4c4 CoreFoundation`CFArrayRemoveAllValues + 104
    frame #16: 0x1c31ff64 CoreFoundation`__CFSetApplyFunction_block_invoke + 24
    frame #17: 0x1c3b2e18 CoreFoundation`CFBasicHashApply + 116
    frame #18: 0x1c31ff10 CoreFoundation`CFSetApplyFunction + 160
    frame #19: 0x1c3152cc CoreFoundation`__CFRunLoopDeallocate + 204
    frame #20: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #21: 0x1c304a80 CoreFoundation`__CFTSDFinalize + 144
    frame #22: 0x1bfa324c libsystem_pthread.dylib`_pthread_tsd_cleanup + 644
    frame #23: 0x1bf9cc08 libsystem_pthread.dylib`_pthread_exit + 80
    frame #24: 0x1bf9af24 libsystem_pthread.dylib`pthread_exit + 36
    frame #25: 0x0039df84 monotouchtest`mono_threads_platform_exit(exit_code=<unavailable>) at mono-threads-posix.c:145:2 [opt]
    frame #26: 0x0033bb84 monotouchtest`start_wrapper(data=0x1609e1c0) at threads.c:1296:2 [opt]
    frame #27: 0x1bf9b914 libsystem_pthread.dylib`_pthread_body + 128
    frame #28: 0x1bf9b874 libsystem_pthread.dylib`_pthread_start + 44
    frame #29: 0x1bfa3b94 libsystem_pthread.dylib`thread_start + 4
```


Debugged by @lambdageek. Contributes to mono#10641

Edit: C/p into PR description so it's in the commit message:

We think what happens is:

1. `start_wrapper` runs `unregister_thread` when the thread is exiting.  At that point the `small_id` for the thread is cleared from the hazard pointer bitset by `mono_thread_small_id_free (small_id)`

2. Some other TLS destructor runs and attaches the thread again (which runs `mono_thread_info_register_small_id` which first calls `mono_thread_info_get_small_id` which tries to get the small id from the `small_id_key` TLS key and so the new `MonoThreadInfo` has the same `small_id` as the previously destroyed `MonoThreadInfo` - but the hazard pointer bitset is **not** updated).

3. This other TLS destructor runs and calls `mono_thread_detach_if_exiting` which calls `unregister_thread` again.

4. `unregister_thread` calls `	mono_thread_small_id_free (small_id)` a second time which asserts because we already cleared that id from the hazard pointer bitset.



Backport of mono#16973.

/cc @lewurm
lambdageek pushed a commit that referenced this pull request Dec 11, 2019
When the debugger tries to suspend all the VM threads, it can happen with cooperative-suspend that it tries to suspend a thread that has previously been attached, but then did a "light" detach (that only unsets the domain). With the domain set to `NULL`, looking up a JitInfo for a given `ip` will result into a crash, but it isn't even necessarily needed for the purpose of suspending a thread.

More details: Consider the following:
```
(lldb) c
error: Process is running.  Use 'process interrupt' to pause execution.
Process 12832 stopped
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
    frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
   2713         MonoDomain *domain = (MonoDomain *) mono_thread_info_get_suspend_state (info)->unwind_data [MONO_UNWIND_DATA_DOMAIN];
   2714         if (!domain) {
   2715                 /* not attached */
-> 2716                 ji = NULL;
   2717         } else {
   2718                 ji = mono_jit_info_table_find_internal ( domain, MONO_CONTEXT_GET_IP (&mono_thread_info_get_suspend_state (info)->ctx), TRUE, TRUE);
   2719         }
Target 0: (WatchWCSessionAppWatchOSExtension) stopped.
(lldb) bt
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  * frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
    frame #1: 0x004e177b WatchWCSessionAppWatchOSExtension`mono_thread_info_safe_suspend_and_run(id=0xb0767000, interrupt_kernel=0, callback=(WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical at debugger-agent.c:2708), user_data=0xb04dc718) at mono-threads.c:1358:19
    frame #2: 0x00222799 WatchWCSessionAppWatchOSExtension`notify_thread(key=0x03994508, value=0x81378c00, user_data=0x00000000) at debugger-agent.c:2747:2
    frame #3: 0x00355bd8 WatchWCSessionAppWatchOSExtension`mono_g_hash_table_foreach(hash=0x8007b4e0, func=(WatchWCSessionAppWatchOSExtension`notify_thread at debugger-agent.c:2733), user_data=0x00000000) at mono-hash.c:310:4
    frame #4: 0x0021e955 WatchWCSessionAppWatchOSExtension`suspend_vm at debugger-agent.c:2844:3
    frame #5: 0x00225ff0 WatchWCSessionAppWatchOSExtension`process_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0, il_offset=0, ctx=0x00000000, events=0x805b28e0, suspend_policy=2) at debugger-agent.c:4012:3
    frame #6: 0x00227c7c WatchWCSessionAppWatchOSExtension`process_profiler_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0) at debugger-agent.c:4072:2
    frame #7: 0x0021b174 WatchWCSessionAppWatchOSExtension`thread_startup(prof=0x00000000, tid=2957889536) at debugger-agent.c:4149:2
    frame #8: 0x0037912d WatchWCSessionAppWatchOSExtension`mono_profiler_raise_thread_started(tid=2957889536) at profiler-events.h:103:1
    frame #9: 0x003d53da WatchWCSessionAppWatchOSExtension`fire_attach_profiler_events(tid=0xb04dd000) at threads.c:1120:2
    frame #10: 0x003d4d83 WatchWCSessionAppWatchOSExtension`mono_thread_attach(domain=0x801750a0) at threads.c:1547:2
    frame #11: 0x003df1a1 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop_internal(domain=0x801750a0, cookie=0xb04dcc0c, stackdata=0xb04dcba8) at threads.c:6008:3
    frame #12: 0x003df287 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop(domain=0x00000000, dummy=0xb04dcc0c) at threads.c:6045:9
    frame #13: 0x005034b8 WatchWCSessionAppWatchOSExtension`::xamarin_switch_gchandle(self=0x80762c20, to_weak=false) at runtime.m:1805:2
    frame #14: 0x005065c1 WatchWCSessionAppWatchOSExtension`::xamarin_retain_trampoline(self=0x80762c20, sel="retain") at trampolines.m:693:2
    frame #15: 0x657ea520 libobjc.A.dylib`objc_retain + 64
    frame #16: 0x4b4d9caa WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke + 279
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb26e Foundation`____addOperations_block_invoke_4 + 20
    frame #24: 0x65de007b libdispatch.dylib`_dispatch_call_block_and_release + 15
    frame #25: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #26: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #27: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #28: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #29: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #30: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #31: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```

Going further, `info` is about this thread:
```
(lldb) p/x *(int *)(((char *) info->node.key) + 0xa0)
(int) $2 = 0x01243f93
(lldb) thread list
Process 12832 stopped
  thread #1: tid = 0x1243ee1, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_303', queue = 'com.apple.main-thread'
  thread #2: tid = 0x1243ee6, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10
  thread #3: tid = 0x1243ee7, 0x65f81aea libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'SGen worker'
  thread #4: tid = 0x1243ee9, 0x65f7e3d2 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'Finalizer'
  thread #5: tid = 0x1243eea, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10, name = 'Debugger agent'
  thread #6: tid = 0x1243f1d, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.uikit.eventfetch-thread'
  thread #8: tid = 0x1243f93, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
* thread #9: tid = 0x12443a9, 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  thread #10: tid = 0x1244581, 0x65f7fd32 libsystem_kernel.dylib`__workq_kernreturn + 10
(lldb) thread select 8
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
    frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
libsystem_kernel.dylib`mach_msg_trap:
->  0x65f7e396 <+10>: retl
    0x65f7e397 <+11>: nop

libsystem_kernel.dylib`mach_msg_overwrite_trap:
    0x65f7e398 <+0>:  movl   $0xffffffe0, %eax         ; imm = 0xFFFFFFE0
    0x65f7e39d <+5>:  calll  0x65f85f44                ; _sysenter_trap
(lldb) bt
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
  * frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x65f7e8ff libsystem_kernel.dylib`mach_msg + 47
    frame #2: 0x66079679 libxpc.dylib`_xpc_send_serializer + 104
    frame #3: 0x660794da libxpc.dylib`_xpc_pipe_simpleroutine + 80
    frame #4: 0x66079852 libxpc.dylib`xpc_pipe_simpleroutine + 43
    frame #5: 0x66043a8f libsystem_trace.dylib`___os_activity_stream_reflect_block_invoke_2 + 30
    frame #6: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #7: 0x65de3d71 libdispatch.dylib`_dispatch_block_invoke_direct + 257
    frame #8: 0x65de3c62 libdispatch.dylib`dispatch_block_perform + 112
    frame #9: 0x6604349a libsystem_trace.dylib`_os_activity_stream_reflect + 725
    frame #10: 0x6604ef19 libsystem_trace.dylib`_os_log_impl_stream + 468
    frame #11: 0x6604e44d libsystem_trace.dylib`_os_log_impl_flatten_and_send + 6410
    frame #12: 0x6604cb3b libsystem_trace.dylib`_os_log + 137
    frame #13: 0x6604f4aa libsystem_trace.dylib`_os_log_impl + 31
    frame #14: 0x4b4eb4e9 WatchConnectivity`WCSerializePayloadDictionary + 393
    frame #15: 0x4b4d7c4d WatchConnectivity`-[WCSession onqueue_sendResponseDictionary:identifier:] + 195
    frame #16: 0x4b4da435 WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke.411 + 35
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb067 Foundation`____addOperations_block_invoke_2 + 20
    frame #24: 0x65dedf49 libdispatch.dylib`_dispatch_block_async_invoke2 + 77
    frame #25: 0x65de4461 libdispatch.dylib`_dispatch_block_async_invoke_and_release + 17
    frame #26: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #27: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #28: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #29: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #30: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #31: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #32: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```
which is a thread in a "light" detach state (aka. coop detach), where we only unset the domain:
https://github.com/mono/mono/blob/4cefdcb7ce2d939ee78fb45d1b4913eb3bc064fd/mono/metadata/threads.c#L6084-L6111

Fixes mono#17926
lambdageek pushed a commit that referenced this pull request Mar 10, 2021
When the debugger tries to suspend all the VM threads, it can happen with cooperative-suspend that it tries to suspend a thread that has previously been attached, but then did a "light" detach (that only unsets the domain). With the domain set to `NULL`, looking up a JitInfo for a given `ip` will result into a crash, but it isn't even necessarily needed for the purpose of suspending a thread.

More details: Consider the following:
```
(lldb) c
error: Process is running.  Use 'process interrupt' to pause execution.
Process 12832 stopped
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
    frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
   2713         MonoDomain *domain = (MonoDomain *) mono_thread_info_get_suspend_state (info)->unwind_data [MONO_UNWIND_DATA_DOMAIN];
   2714         if (!domain) {
   2715                 /* not attached */
-> 2716                 ji = NULL;
   2717         } else {
   2718                 ji = mono_jit_info_table_find_internal ( domain, MONO_CONTEXT_GET_IP (&mono_thread_info_get_suspend_state (info)->ctx), TRUE, TRUE);
   2719         }
Target 0: (WatchWCSessionAppWatchOSExtension) stopped.
(lldb) bt
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  * frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
    frame #1: 0x004e177b WatchWCSessionAppWatchOSExtension`mono_thread_info_safe_suspend_and_run(id=0xb0767000, interrupt_kernel=0, callback=(WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical at debugger-agent.c:2708), user_data=0xb04dc718) at mono-threads.c:1358:19
    frame #2: 0x00222799 WatchWCSessionAppWatchOSExtension`notify_thread(key=0x03994508, value=0x81378c00, user_data=0x00000000) at debugger-agent.c:2747:2
    frame #3: 0x00355bd8 WatchWCSessionAppWatchOSExtension`mono_g_hash_table_foreach(hash=0x8007b4e0, func=(WatchWCSessionAppWatchOSExtension`notify_thread at debugger-agent.c:2733), user_data=0x00000000) at mono-hash.c:310:4
    frame #4: 0x0021e955 WatchWCSessionAppWatchOSExtension`suspend_vm at debugger-agent.c:2844:3
    frame #5: 0x00225ff0 WatchWCSessionAppWatchOSExtension`process_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0, il_offset=0, ctx=0x00000000, events=0x805b28e0, suspend_policy=2) at debugger-agent.c:4012:3
    frame #6: 0x00227c7c WatchWCSessionAppWatchOSExtension`process_profiler_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0) at debugger-agent.c:4072:2
    frame #7: 0x0021b174 WatchWCSessionAppWatchOSExtension`thread_startup(prof=0x00000000, tid=2957889536) at debugger-agent.c:4149:2
    frame #8: 0x0037912d WatchWCSessionAppWatchOSExtension`mono_profiler_raise_thread_started(tid=2957889536) at profiler-events.h:103:1
    frame #9: 0x003d53da WatchWCSessionAppWatchOSExtension`fire_attach_profiler_events(tid=0xb04dd000) at threads.c:1120:2
    frame #10: 0x003d4d83 WatchWCSessionAppWatchOSExtension`mono_thread_attach(domain=0x801750a0) at threads.c:1547:2
    frame #11: 0x003df1a1 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop_internal(domain=0x801750a0, cookie=0xb04dcc0c, stackdata=0xb04dcba8) at threads.c:6008:3
    frame #12: 0x003df287 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop(domain=0x00000000, dummy=0xb04dcc0c) at threads.c:6045:9
    frame #13: 0x005034b8 WatchWCSessionAppWatchOSExtension`::xamarin_switch_gchandle(self=0x80762c20, to_weak=false) at runtime.m:1805:2
    frame #14: 0x005065c1 WatchWCSessionAppWatchOSExtension`::xamarin_retain_trampoline(self=0x80762c20, sel="retain") at trampolines.m:693:2
    frame #15: 0x657ea520 libobjc.A.dylib`objc_retain + 64
    frame #16: 0x4b4d9caa WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke + 279
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb26e Foundation`____addOperations_block_invoke_4 + 20
    frame #24: 0x65de007b libdispatch.dylib`_dispatch_call_block_and_release + 15
    frame #25: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #26: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #27: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #28: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #29: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #30: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #31: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```

Going further, `info` is about this thread:
```
(lldb) p/x *(int *)(((char *) info->node.key) + 0xa0)
(int) $2 = 0x01243f93
(lldb) thread list
Process 12832 stopped
  thread #1: tid = 0x1243ee1, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_303', queue = 'com.apple.main-thread'
  thread #2: tid = 0x1243ee6, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10
  thread #3: tid = 0x1243ee7, 0x65f81aea libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'SGen worker'
  thread #4: tid = 0x1243ee9, 0x65f7e3d2 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'Finalizer'
  thread #5: tid = 0x1243eea, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10, name = 'Debugger agent'
  thread #6: tid = 0x1243f1d, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.uikit.eventfetch-thread'
  thread #8: tid = 0x1243f93, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
* thread #9: tid = 0x12443a9, 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  thread #10: tid = 0x1244581, 0x65f7fd32 libsystem_kernel.dylib`__workq_kernreturn + 10
(lldb) thread select 8
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
    frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
libsystem_kernel.dylib`mach_msg_trap:
->  0x65f7e396 <+10>: retl
    0x65f7e397 <+11>: nop

libsystem_kernel.dylib`mach_msg_overwrite_trap:
    0x65f7e398 <+0>:  movl   $0xffffffe0, %eax         ; imm = 0xFFFFFFE0
    0x65f7e39d <+5>:  calll  0x65f85f44                ; _sysenter_trap
(lldb) bt
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
  * frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x65f7e8ff libsystem_kernel.dylib`mach_msg + 47
    frame #2: 0x66079679 libxpc.dylib`_xpc_send_serializer + 104
    frame #3: 0x660794da libxpc.dylib`_xpc_pipe_simpleroutine + 80
    frame #4: 0x66079852 libxpc.dylib`xpc_pipe_simpleroutine + 43
    frame #5: 0x66043a8f libsystem_trace.dylib`___os_activity_stream_reflect_block_invoke_2 + 30
    frame #6: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #7: 0x65de3d71 libdispatch.dylib`_dispatch_block_invoke_direct + 257
    frame #8: 0x65de3c62 libdispatch.dylib`dispatch_block_perform + 112
    frame #9: 0x6604349a libsystem_trace.dylib`_os_activity_stream_reflect + 725
    frame #10: 0x6604ef19 libsystem_trace.dylib`_os_log_impl_stream + 468
    frame #11: 0x6604e44d libsystem_trace.dylib`_os_log_impl_flatten_and_send + 6410
    frame #12: 0x6604cb3b libsystem_trace.dylib`_os_log + 137
    frame #13: 0x6604f4aa libsystem_trace.dylib`_os_log_impl + 31
    frame #14: 0x4b4eb4e9 WatchConnectivity`WCSerializePayloadDictionary + 393
    frame #15: 0x4b4d7c4d WatchConnectivity`-[WCSession onqueue_sendResponseDictionary:identifier:] + 195
    frame #16: 0x4b4da435 WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke.411 + 35
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb067 Foundation`____addOperations_block_invoke_2 + 20
    frame #24: 0x65dedf49 libdispatch.dylib`_dispatch_block_async_invoke2 + 77
    frame #25: 0x65de4461 libdispatch.dylib`_dispatch_block_async_invoke_and_release + 17
    frame #26: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #27: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #28: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #29: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #30: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #31: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #32: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```
which is a thread in a "light" detach state (aka. coop detach), where we only unset the domain:
https://github.com/mono/mono/blob/4cefdcb7ce2d939ee78fb45d1b4913eb3bc064fd/mono/metadata/threads.c#L6084-L6111

Fixes mono#17926
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants