Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional performance improvements for module instantiation. #2842

Merged
merged 8 commits into from
Apr 16, 2021

Conversation

peterhuene
Copy link
Member

This PR contains a number of internal refactorings that help make Store::register_module do as little work as possible (registration is now only an insertion into a BTreeMap).

This will improve module instantiation performance for modules containing lots of functions and instructions.

The changes are:

  • Move the shared signature registry out of Store and into Engine. Modules now register their signatures with the registry upon creation and store the signature index -> shared index mappings internally.
  • Engines now register the signatures of any host functions on Config upon engine creation as well as store the anyfunc representations of the host functions (previously kept in Store).
  • Removed StoreFrameInfo and StackMapRegistry in favor of a unified ModuleRegistry per-store. The module registry can be queried based on program counter values to quickly locate the owning module.
  • With the removal of StackMapRegistry, the runtime is now given a trait object to call back on for finding the stack map for a program counter value during GC. The lookup uses the existing data in CompiledModule which is already optimized for the search.

This PR supersedes #2820.

@peterhuene
Copy link
Member Author

peterhuene commented Apr 14, 2021

@alexcrichton Hopefully this didn't step on the work you were in the middle of for the feedback on #2820 (🤞 that you hadn't started work on it before going on vacation). As I needed to refactor StoreFrameInfo into a more general "module registry" for the signature registry changes anyway, it was pretty simple to remove StackMapRegistry entirely at that point.

@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Apr 15, 2021
@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries at all! I think it makes sense all as one go as well and I also hadn't started working on it yet :)

I left a bunch of comments below, but my main thinking is that I'm a little uncomfortable with the dynamic lifetimes of a VMSharedSignatureIndex now. I only realized halfway through that you were careful to not store VMTrampoline pointers globally (which makes sense), but this is definitely a radically different ownership model than what we had before where "everything lives as long as Store".

I don't see anything wrong here but I'm hoping that we can finagle the APIs a bit and/or use registration "tokens" or something like that to better document-via-code the ownership story for shared signature indices.

crates/jit/src/instantiate.rs Show resolved Hide resolved
crates/runtime/src/externref.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/engine.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/engine.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/signatures.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/module/registry.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/module/registry.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/module/registry.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/module/serialization.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/signatures.rs Show resolved Hide resolved
Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, @peterhuene!

crates/wasmtime/src/engine.rs Show resolved Hide resolved
crates/wasmtime/src/engine.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/engine.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/engine.rs Outdated Show resolved Hide resolved
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yeah you're right about the filter_map and it not being usable. That being said it still makes me feel uncomfortable. What would you think of something like this:

trait ModuleLookup {
    fn lookup_module(&self, pc: usize) -> Option<Arc<dyn StackMapLookup>>;
}

trait StackMapLookup {
    fn lookup_func(&self, pc: usize) -> Option<&StackMap>;
}

The two-level indirection is kinda ugly but I think this should work. We'd store *const dyn ModuleLookup inside of the VMContext and looking up a stack map would be a two-level process. The Arc returned from the first one would be Arc<CompiledModule> under the hood.

We could even go further and do:

trait ModuleLookup {
    fn lookup_module(&self, pc: usize) -> Option<Arc<dyn StackMapLookup>>;
}

trait StackMapLookup {
    fn lookup_func(&self, pc: usize) -> Option<&[StackMapInformation]>;
}

which would leave the binary-search to the externref.rs module instead of having it sort of out-of-the-place in registry.rs

crates/wasmtime/src/trampoline.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/func/typed.rs Outdated Show resolved Hide resolved
crates/wasmtime/src/values.rs Outdated Show resolved Hide resolved
@peterhuene
Copy link
Member Author

peterhuene commented Apr 16, 2021

That push of the latest commit broke the build in a really strange way (as if the CI wasn't building all the changes) 😕.

Oh, upstream just added a new call to initialize trap handling, noticed it on a rebase (weird that GitHub is reporting no merge conflicts though as I had one on the rebase). Will fix.

This commit moves the shared signature registry out of `Store` and into
`Engine`.

This helps eliminate work that was performed whenever a `Module` was
instantiated into a `Store`.

Now a `Module` is registered with the shared signature registry upon creation,
storing the mapping from the module's signature index space to the shared index
space.

This also refactors the "frame info" registry into a general purpose "module
registry" that is used to look up trap information, signature information, and
(soon) stack map information.
This commit removes the stack map registry and instead uses the existing
information from the store's module registry to lookup stack maps.

A trait is now used to pass the lookup context to the runtime, implemented by
`Store` to do the lookup.

With this change, module registration in `Store` is now entirely limited to
inserting the module into the module registry.
* Make `FunctionInfo` public and `CompiledModule::func_info` return it.
* Make the `StackMapLookup` trait unsafe.
* Add comments for the purpose of `EngineHostFuncs`.
* Rework ownership model of shared signatures: `SignatureCollection` in
  conjunction with `SignatureRegistry` is now used so that the `Engine`,
  `Store`, and `Module` don't need to worry about unregistering shared
  signatures.
* Implement `Func::param_arity` and `Func::result_arity` in terms of
  `Func::ty`.
* Make looking up a trampoline with the module registry more efficient by doing
  a binary search on the function's starting PC value for the owning module and
  then looking up the trampoline with only that module.
* Remove reference to the shared signatures from `GlobalRegisteredModule`.
@peterhuene
Copy link
Member Author

Wouldn't the two-stage lookup suffer the same problem as we're borrowing from an interior reference of a RefCell optionally, regardless of returning the module to look up the stack map for or the stack map itself?

@peterhuene
Copy link
Member Author

Oh I guess if it's Arc that's fine (no more borrow); I see how to implement this, so I'll make that change.

This commit uses a two-phase lookup of stack map information from modules
rather than giving back raw pointers to stack maps.

First the runtime looks up information about a module from a pc value, which
returns an `Arc` it keeps a reference on while completing the stack map lookup.

Second it then queries the module information for the stack map from a pc
value, getting a reference to the stack map (which is now safe because of the
`Arc` held by the runtime).
This file hasn't been used for a while and was mistakenly not deleted.
@peterhuene
Copy link
Member Author

@alexcrichton all feedback should now be addressed.

This commit adds `Module::from_parts` as an internal constructor that shared
the implementation between `Module::from_binary` and module deserialization.
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎊

@peterhuene peterhuene merged commit 6b6a646 into bytecodealliance:main Apr 16, 2021
@peterhuene peterhuene deleted the engine-sig-registry branch April 16, 2021 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants