-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wasm] Migrate from catch_all to catch in the jiterpreter and destroy exceptions #91364
Conversation
Tagging subscribers to this area: @BrzVlad, @kotlarmilos Issue DetailsThis PR migrates from using the 'catch_all' opcode in the jiterpreter to using 'catch' with a specific tag, which should fix #90692. Doing this requires build changes in order to ensure that the c++ exception handling infrastructure is preserved by the linker and exported so it becomes visible to the jiterpreter. I'm still not sure what exact set of changes is necessary because it turns out the way emscripten parses exception-related options is Strange(tm) and we need to have the exact right settings at both compile time and link time for it to work. Not ready for merge yet, but I want to see what breaks on CI.
|
4f3c536
to
21e93e1
Compare
Requesting review because the functionality parts of this are basically done, and I reduced the build changes to the ones that seem necessary to me. I'm still not sure how we should actually do these build configuration changes though, so I hope someone can help figure that out. The situation as I understand it:
|
The linked issue was AOT issue, does this address non-AOT too ? |
It looks like the way Emscripten implements C++ exceptions when wasm exceptions are disabled is by doing a rewrite pass in LLVM that throws a JS exception on a we might be able to emulate this rewrite in the jiterpreter. and possibly mess with the functions in library_exceptions.js to add some mono specific behavior on catches. but i'm not sure how complex that will be |
All the code in question only runs for AOTed applications as far as I know. I don't know what the impact on link time or generated code are yet. |
It looks like this version of __cxa_begin_catch is written in JS, so it's possible we can do something similar to what I'm already doing and it will still work... the trick is figuring out what shape emscripten's JS exceptions have. There's probably a tag property I can check for. The jiterpreter does have JS fallbacks already so it's a matter of customizing them I think. |
Part way through getting this all to work with wasm exception handling disabled. |
Should work now in both native WASM EH mode and JavaScript EH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how we can look at WasmEnableExceptionHandling
in mono.proj
and wasm.proj
src/mono/mono.proj
Outdated
<_AppropriateWasmExceptionFlags Condition="'$(WasmEnableExceptionHandling)' == 'false'">-fexceptions</_AppropriateWasmExceptionFlags> | ||
<_AppropriateWasmExceptionFlags Condition="'$(WasmEnableExceptionHandling)' != 'false'">-fwasm-exceptions</_AppropriateWasmExceptionFlags> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work? Users can set WasmEnableExceptionHandling
to either true or false in their own projects - which gets picked up by WasmApp.targets
. But it can't affect how the runtime is compiled - that happens once in CI and then gets packed up into a nuget.
So we build the runtime with whatever WasmEnableExceptionHandling
value happens to be in effect on CI (I think right now the default is true
, right?). But if a user builds their own project with false
we will still link it with a runtime built with -fexceptions
. Is that reasonable? I wouldn't think so... the runtime's code wouldn't be using the wasm EH instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem is that (as far as i can tell) even if you don't use EH, building with -fexceptions causes the emscripten JS exception handling glue to get linked in, and if you build with -fwasm-exceptions it links in C++ exception handling glue instead. So if we build with a mix of both, we have trouble, aside from the question of "what happens if you put a try/catch in your code".
It's possible we can get away with these flags being wrong as long as the flags are correct at the point of the final link. I'm not really sure, I don't fully understand all these different parts of our build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this, I'm still not confident in it but it seems to work.
This does not appear to fix #90692, but from testing in the debugger that's because Environment.Exit is not throwing a JS exception, so even if I use catch with a tag we still catch whatever is being thrown. The problem appears to be somewhere in the AOT generated code, we've got |
Wouldn't be simpler to use the existing |
That doesn't appear to fix this. The purpose of the jiterp stuff is to eliminate the indirect call per transition, which gives a measurable speedup. With this PR's changes the three modes (mono_llvm_cpp_catch_exception, jiterpreter's generic dispatcher, and jiterpreter's specialized dispatchers) should generate equivalent code from what I know.
I'm not sure what we would replace it with, but yeah. Another option would be to change the interp.c logic so that if it sees no exception data is available, it knows a Special Exception was thrown that is supposed to fully unwind the stack (which would be Environment.Exit or Thread.Abort, I think?) and proceeds to unwind. But the original Special Exception is still gone without a trace because some part of our AOT machinery ate it. We would still have a problem though because libc code expects to be able to |
d9ae2d5
to
95963b9
Compare
Updated to remove the jiterpreter's generic jitcall dispatcher - it's no longer necessary since Wasm EH is the default for the runtime now. It was only useful for JS-based EH. This removes some potential failure cases, should reduce the size of the runtime JS a little bit, and reduces code complexity. |
Does adding a test case make sense? |
I'm not sure how we would write a test for this, since Environment.Exit would kill the test host. Maybe something that does a bunch of hops through JS and C# would work... |
@@ -291,6 +291,8 @@ export interface t_Cwraps { | |||
// returns new size of queue after add | |||
mono_jiterp_tlqueue_add(queue: number, value: VoidPtr): number; | |||
mono_jiterp_tlqueue_clear(queue: number): void; | |||
mono_jiterp_begin_catch(ptr: number): void; | |||
mono_jiterp_end_catch(): void; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this always throws, right ? In that case the return type should be never
not void
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't always throw
builder.appendU8(WasmOpcode.catch_all); | ||
builder.appendU8(WasmOpcode.catch_); | ||
builder.appendULeb(builder.getTypeIndex("__cpp_exception")); | ||
builder.callImport("begin_catch"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the JS code in mono_interp_invoke_wasm_jit_call_trampoline
has more branches than this. Could you please explain what is the difference ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside this branch we know wasm EH is enabled, so we by definition will not encounter a JS exception... I think. Even if we did, we couldn't handle it natively using wasm code - we would have to use JS.
invoke_wasm_jit_call_trampoline on the other hand needs to work whether wasm EH is enabled or disabled. It is responsible for handling JS exceptions on behalf of trampolines (which can't handle them).
const result: any = { | ||
c: <any>this.getConstants(), | ||
m: { h: memory }, | ||
}; | ||
if (exceptionTag) | ||
result.x = { e: exceptionTag }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure x
is great name if the exception could ever hit or come from customer code. It may conflict there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these 'x' and 'e' names are wasm import names used by jiterpreter modules
this.appendName("x"); | ||
this.appendName("e"); | ||
// tagtype | ||
this.appendU8(0x04); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make those named constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are binary constants from the webassembly spec's binary format, the comment above them explains what they are.
I don't understand how this is related to Would dereferencing |
… tag from our wasm module Expose the cxa catch functions to jiterp Import the exception tag type from dotnet.wasm when generating jiterpreter modules Checkpoint Maybe fix build on CI Build flag cleanup Clean up build flags Properly free C++ exceptions in do_jit_call_indirect_js and don't trap non-C++ exceptions Properly clean up exceptions in the fallback trampoline invoke routine (though this shouldn't be necessary) Call begin_catch/end_catch in generated jitcall trampolines Checkpoint: Support for emscripten's JS-based exception handling Disable log messages
95963b9
to
8d67451
Compare
@@ -207,7 +207,7 @@ | |||
<ActualWasiSdkVersion>%(_ActualVersionLines.Identity)</ActualWasiSdkVersion> | |||
<ExpectedWasiSdkVersion>%(_ExpectedVersionLines.Identity)</ExpectedWasiSdkVersion> | |||
</PropertyGroup> | |||
<Error Text="Expected and actual version of WASI SDK does not match. Please delete $(WASI_SDK_PATH) folder to provision a new version." | |||
<Error Text="Expected and actual version of WASI SDK does not match. Please delete $(WASI_SDK_PATH) folder to provision a new version." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
/azp run runtime-wasm |
Azure Pipelines successfully started running 1 pipeline(s). |
Tests were green in the past, but these two failures look like they could be related. I'm not sure how this would cause a timeout though. I'm not sure why there is a WBT failure in the test analysis but it's not showing up in the checks list... |
I can see only one failure, the debugger tests. Just fyi, you might want to run that again because I have seen this failure on debugger tests on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Threading failure is real, looking into it
|
Looks like it's not my fault and the sample was already broken:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmake build bits LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
msbuild bits look good
This PR migrates from using the
catch_all
opcode in the jiterpreter to usingcatch
with a specific tag, which partially fixes #90692.Doing this requires build changes in order to ensure that the c++ exception handling infrastructure is preserved by the linker and exported so it becomes visible to the jiterpreter. In addition to the transition from
catch_all
tocatch
we now engage the c++ exception handling infrastructure to free thrown exceptions, since previously they would have leaked.This PR also removes the jiterpreter's do_jit_call/jit_call_cb dispatcher - we no longer need it for runtime builds using Wasm EH (the code clang generates in that mode has equivalent or better performance).
The rest of the fix for 90692 is being handled in another PR.