Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazily populate a store's trampoline map #3742

Merged

Conversation

alexcrichton
Copy link
Member

This commit is another installment of "how fast can we make
instantiation". Currently when instantiating a module with many function
imports each function, typically from the host, is inserted into the
store. This insertion process stores the VMTrampoline for the host
function in a side table so it can be looked up later if the host
function is called through the Func interface. This insertion process,
however, involves a hash map insertion which can be relatively expensive
at the scale of the rest of the instantiation process.

The optimization implemented in this commit is to avoid inserting
trampolines into the store at Func-insertion-time (aka instantiation
time) and instead only lazily populate the map of trampolines when
needed. The theory behind this is that almost all Func instances that
are called indirectly from the host are actually wasm functions, not
host-defined functions. This means that they already don't need to go
through the map of host trampolines and can instead be looked up from
the module they're defined in. With the assumed rarity of host functions
making lookup_trampoline a bit slower seems ok.

The lookup_trampoline function will now, on a miss from the wasm
modules and host_trampolines map, lazily iterate over the functions
within the store and insert trampolines into the host_trampolines map.
This process will eventually reach something which matches the function
provided because it should at least hit the same host function. The
relevant lookup_trampoline now sports a new documentation block
explaining all this as well for future readers.

Concretely this commit speeds up instantiation of an empty module with
100 imports and ~80 unique signatures from 10.6us to 6.4us, a 40%
improvement.

This commit is another installment of "how fast can we make
instantiation". Currently when instantiating a module with many function
imports each function, typically from the host, is inserted into the
store. This insertion process stores the `VMTrampoline` for the host
function in a side table so it can be looked up later if the host
function is called through the `Func` interface. This insertion process,
however, involves a hash map insertion which can be relatively expensive
at the scale of the rest of the instantiation process.

The optimization implemented in this commit is to avoid inserting
trampolines into the store at `Func`-insertion-time (aka instantiation
time) and instead only lazily populate the map of trampolines when
needed. The theory behind this is that almost all `Func` instances that
are called indirectly from the host are actually wasm functions, not
host-defined functions. This means that they already don't need to go
through the map of host trampolines and can instead be looked up from
the module they're defined in. With the assumed rarity of host functions
making `lookup_trampoline` a bit slower seems ok.

The `lookup_trampoline` function will now, on a miss from the wasm
modules and `host_trampolines` map, lazily iterate over the functions
within the store and insert trampolines into the `host_trampolines` map.
This process will eventually reach something which matches the function
provided because it should at least hit the same host function. The
relevant `lookup_trampoline` now sports a new documentation block
explaining all this as well for future readers.

Concretely this commit speeds up instantiation of an empty module with
100 imports and ~80 unique signatures from 10.6us to 6.4us, a 40%
improvement.
@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Jan 28, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed comments! Very helpful. I nitpicked a bunch, but I hope my nitpicks should collectively make reading this documentation even more helpful for future spelunkers.

crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/store.rs Outdated Show resolved Hide resolved
.skip(self.host_func_trampolines_registered)
{
self.host_func_trampolines_registered += 1;
self.host_trampolines.insert(f.sig_index(), f.trampoline());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.host_trampolines.insert(f.sig_index(), f.trampoline());
let old_entry = self.host_trampolines.insert(f.sig_index(), f.trampoline());
debug_assert!(old_entry.is_none());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah actually doing this causes a test failure because we might find duplicates of signatures we're not looking for as the array of functions is scanned.

@alexcrichton alexcrichton merged commit c839685 into bytecodealliance:main Feb 2, 2022
@alexcrichton alexcrichton deleted the no-insert-hashmap-instantiate branch February 2, 2022 15:43
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Feb 2, 2022
* Lazily populate a store's trampoline map

This commit is another installment of "how fast can we make
instantiation". Currently when instantiating a module with many function
imports each function, typically from the host, is inserted into the
store. This insertion process stores the `VMTrampoline` for the host
function in a side table so it can be looked up later if the host
function is called through the `Func` interface. This insertion process,
however, involves a hash map insertion which can be relatively expensive
at the scale of the rest of the instantiation process.

The optimization implemented in this commit is to avoid inserting
trampolines into the store at `Func`-insertion-time (aka instantiation
time) and instead only lazily populate the map of trampolines when
needed. The theory behind this is that almost all `Func` instances that
are called indirectly from the host are actually wasm functions, not
host-defined functions. This means that they already don't need to go
through the map of host trampolines and can instead be looked up from
the module they're defined in. With the assumed rarity of host functions
making `lookup_trampoline` a bit slower seems ok.

The `lookup_trampoline` function will now, on a miss from the wasm
modules and `host_trampolines` map, lazily iterate over the functions
within the store and insert trampolines into the `host_trampolines` map.
This process will eventually reach something which matches the function
provided because it should at least hit the same host function. The
relevant `lookup_trampoline` now sports a new documentation block
explaining all this as well for future readers.

Concretely this commit speeds up instantiation of an empty module with
100 imports and ~80 unique signatures from 10.6us to 6.4us, a 40%
improvement.

* Review comments

* Remove debug assert
mpardesh pushed a commit to avanhatt/wasmtime that referenced this pull request Mar 17, 2022
* Lazily populate a store's trampoline map

This commit is another installment of "how fast can we make
instantiation". Currently when instantiating a module with many function
imports each function, typically from the host, is inserted into the
store. This insertion process stores the `VMTrampoline` for the host
function in a side table so it can be looked up later if the host
function is called through the `Func` interface. This insertion process,
however, involves a hash map insertion which can be relatively expensive
at the scale of the rest of the instantiation process.

The optimization implemented in this commit is to avoid inserting
trampolines into the store at `Func`-insertion-time (aka instantiation
time) and instead only lazily populate the map of trampolines when
needed. The theory behind this is that almost all `Func` instances that
are called indirectly from the host are actually wasm functions, not
host-defined functions. This means that they already don't need to go
through the map of host trampolines and can instead be looked up from
the module they're defined in. With the assumed rarity of host functions
making `lookup_trampoline` a bit slower seems ok.

The `lookup_trampoline` function will now, on a miss from the wasm
modules and `host_trampolines` map, lazily iterate over the functions
within the store and insert trampolines into the `host_trampolines` map.
This process will eventually reach something which matches the function
provided because it should at least hit the same host function. The
relevant `lookup_trampoline` now sports a new documentation block
explaining all this as well for future readers.

Concretely this commit speeds up instantiation of an empty module with
100 imports and ~80 unique signatures from 10.6us to 6.4us, a 40%
improvement.

* Review comments

* Remove debug assert
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants