Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for async call hooks #3876

Merged
merged 12 commits into from
Mar 23, 2022

Conversation

acw
Copy link
Contributor

@acw acw commented Mar 2, 2022

The present runtime supports the ability to add aspect-like hooks to wasm entrance/exit via the call_hook method. In some cases, it can be useful to have this hook do asynchronous operations; to acquire a lock, for example.

This patch series provides an alternate way for users to provide a call hook that mirrors the resource limiter callbacks. To get around some trickiness with Rust types, closures, and Futures, call_hook_async takes an object on which a simple trait is implemented. This hook is then executed using the existing block_on mechanism.

Beyond just adding this capability, this patch makes two other changes to the code base to avoid an error condition in which one of these tasks becomes scheduled right as an async fiber is being dropped. The first changes the async_cx method so that it returns Option<AsyncCx> instead of just AsyncCx; this can be an early detection mechanism for the fiber being dropped. The more key one is a late check in block_on, which turns a former-panic into being a catchable an error condition.

Many thanks to @alexcrichton, who figured out the "resuming a task on a dying thread" issue.

@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Mar 3, 2022
@github-actions
Copy link

github-actions bot commented Mar 3, 2022

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks! I mostly just commented below how lots of fallible error handling I think can be asserting instead since it's indicative of a bug in Wasmtime if we hit the None case for the AsyncCx in most locations (it's just the one on CallHook that matters).

Additionally though can you be sure to add some tests for this, also specifically for a case such as where a hook is registered and the async invocation is interrupted?

crates/wasmtime/src/func.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/func.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/linker.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/linker.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
@@ -1251,11 +1310,15 @@ impl StoreOpaque {

#[cfg(feature = "async")]
#[inline]
pub fn async_cx(&self) -> AsyncCx {
pub fn async_cx(&self) -> Option<AsyncCx> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some documentation to this function as to why Option is the return value? (notably pointing out as well that this panics if async_support is not enabled, whereas the signature might naively appear to return None if async support is not enabled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to just make the check be for both conditions (i.e., we have async_support and we're not being torn down), rather than panicking if async_support is not enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I've added the basic docs, with the panic sill allowed, in 09d62a4)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking is that the decision about whether to do something async or not is a higher-level concern than this function, so something shouldn't unconditionally call this and then act on the return value, instead it should have already made some prior decision to call this function and then it acts on the return value as necessary. (plus having two different None states would mean we'd probably have to return some sort of custom enum which is a bit of a pain)

crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
pub trait CallHookHandler<T>: Send {
/// A callback to run when wasmtime is about to enter a host call, or when about to
/// exit the hostcall.
async fn handle_call_event(&self, t: &mut T, ch: CallHook) -> Result<(), crate::Trap>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One worry here: async_trait means this will be using a Box under the hood, which means we incur an allocation for every hook invocation, i.e. two times per hostcall.

A different factoring might allow us to allocate once up front (for the hook itself), basically by making this a poll-style function rather than an async fn.

How worried are we about these costs? AIUI we care a lot about host/guest boundary crossing being cheap.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's actually a good point! I haven't ever done benchmarking myself about the costs with call hooks enabled, but I suspect that even as-is it's already at least an order of magnitude slower than "raw" syscalls. That being said though even when wiggle is layered on top I suspect it's even slower as we haven't done much work to optimize wiggle bits (I know there's a Mutex somewhere right now frobbed on all entrances and exits, but the mutex is always un-contended). In that sense it might be good to benchmark this locally in something like Viceroy perhaps and see if the overhead is meaningful?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Seems like it'd be interesting data to have around -- comparing no call hook, sync call hook, and async call hook.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To assist in poking around with performance bits I wrote up #3883 which should be a good spot we can insert an async call hook and see the performance. My expectation is that given how slow most async things already are an extra allocation here isn't going to really be all that expensive. For comparison calling into wasm itself is on the order of microseconds instead of nanoseconds because of how our stack management currently works. Calling into the host already allocates a Box'd future per call so allocating 2 more probably isn't the end of the world.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the benchmark added there the host-calling-wasm case has negligible overhead with this design of async hooks because the cost of that transition is on the order of a microsecond where the allocations here are more like nanoseconds. I think setting up the stack and unconditionally doing a context switch is probably the expensive part.

For wasm-calling-host this imposes a ~40ns overhead. If the host function is defined via Func::wrap ("typed" and "synchronous") then the sync hook vs async hook is 7ns vs 46ns. For functiosn defined via Func::wrapN_async ("typed" and async) then the sync hook vs async hook is 29ns vs 69ns.

Now whether those specific numbers make sense in a broader context I'm not sure, but I would suspect that most of the cost is indeed the allocation of the returned future from this trait.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Mar 4, 2022
The goal of this new benchmark, `call`, is to help us measure the
overhead of both calling into WebAssembly from the host as well as
calling the host from WebAssembly. There's lots of various ways to
measure this so this benchmark is a bit large but should hopefully be
pretty thorough. It's expected that this benchmark will rarely be run in
its entirety but rather only a subset of the benchmarks will be run at
any one given time.

Some metrics measured here are:

* Typed vs Untyped vs Unchecked - testing the cost of both calling wasm
  with these various methods as well as having wasm call the host where
  the host function is defined with these various methods.

* With and without `call_hook` - helps to measure the overhead of the
  `Store::call_hook` API.

* Synchronous and Asynchronous - measures the overhead of calling into
  WebAssembly asynchronously (with and without the pooling allocator) in
  addition to defining host APIs in various methods when wasm is called
  asynchronously.

Currently all the numbers are as expected, notably:

* Host calling WebAssembly is ~25ns of overhead
* WebAssembly calling the host is ~3ns of overhead
* "Unchecked" is a bit slower than "typed", and "Untyped" is slower than
  unchecked.
* Asynchronous wasm calling a synchronous host function has ~3ns of
  overhead (nothing more than usual).
* Asynchronous calls are much slower, on the order of 2-3us due to
  `madvise`.

Lots of other fiddly bits that can be measured here, but this will
hopefully help establish a benchmark through which we can measure in the
future in addition to measuring changes such as bytecodealliance#3876
alexcrichton added a commit that referenced this pull request Mar 4, 2022
The goal of this new benchmark, `call`, is to help us measure the
overhead of both calling into WebAssembly from the host as well as
calling the host from WebAssembly. There's lots of various ways to
measure this so this benchmark is a bit large but should hopefully be
pretty thorough. It's expected that this benchmark will rarely be run in
its entirety but rather only a subset of the benchmarks will be run at
any one given time.

Some metrics measured here are:

* Typed vs Untyped vs Unchecked - testing the cost of both calling wasm
  with these various methods as well as having wasm call the host where
  the host function is defined with these various methods.

* With and without `call_hook` - helps to measure the overhead of the
  `Store::call_hook` API.

* Synchronous and Asynchronous - measures the overhead of calling into
  WebAssembly asynchronously (with and without the pooling allocator) in
  addition to defining host APIs in various methods when wasm is called
  asynchronously.

Currently all the numbers are as expected, notably:

* Host calling WebAssembly is ~25ns of overhead
* WebAssembly calling the host is ~3ns of overhead
* "Unchecked" is a bit slower than "typed", and "Untyped" is slower than
  unchecked.
* Asynchronous wasm calling a synchronous host function has ~3ns of
  overhead (nothing more than usual).
* Asynchronous calls are much slower, on the order of 2-3us due to
  `madvise`.

Lots of other fiddly bits that can be measured here, but this will
hopefully help establish a benchmark through which we can measure in the
future in addition to measuring changes such as #3876
tests/all/call_hook.rs Outdated Show resolved Hide resolved
tests/all/call_hook.rs Outdated Show resolved Hide resolved
tests/all/call_hook.rs Outdated Show resolved Hide resolved
acw added 11 commits March 22, 2022 17:43
…e on a dying fiber.

This situation should never occur in the existing code base, but can be
triggered if support for running outside async code in a call hook.
…dying.

This should never happen in the existing code base, but is a nice
forward-looking guard. The current implementations simply lift the
trap that would eventually be produced by such an operation into
a `Trap` (or similar) at the invocation of `async_cx()`.
This retains the ability to do non-async hooks. Hooks end up being
implemented as an async trait with a handler call, to get around some
issues passing around async closures. This change requires some of
the prior changes to handle picking up blocked tasks during fiber
shutdown, to avoid some panics during timeouts and other such events.
…em into expect()/unwrap().

The justification for this revert is that (a) these events shouldn't
happen, and (b) they wouldn't be catchable by wasm anyways.
…heck.

This also moves the checks inside of their respective Async variants,
meaning that if you're using an async-enabled version of wasmtime but
using the synchronous versions of the callbacks, you won't pay any
penalty for validating the async context.
In the prior version, we only checked that the box had not been cleared,
but had not ensured that there was an actual context for us to use. This
updates the check to validate both, returning None if the inner context
is missing. This allows us to skip a validation check inside `block_on`,
since all callers will have run through the `async_cx` check prior to
arrival.
@acw acw force-pushed the acw/call-hook-async-variant branch from 207f2e6 to 48e7f6e Compare March 23, 2022 00:45
@acw
Copy link
Contributor Author

acw commented Mar 23, 2022

Apologies for the rebase 😞, but this should be good to go.

The one thing I haven't figured out is a test case that triggers async_cx() returning None due to a drop on the relevant FiberFuture. I can get this to happen as part of executing a post-call hook in a much larger system, but am having trouble reducing it to a test. If you have any ideas, I'd love to hear them.

Should help exercise that the check for `None` is properly handled in a
few more locations.
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! I pushed up what I was thinking for a test which looks like it panics if the one case of .ok_or() switches to a unwrap() which is mainly what I wanted to make sure we tested.

@alexcrichton alexcrichton merged commit 6a60e83 into bytecodealliance:main Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants