-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose ModuleExport
to avoid repeated string lookups
#7828
Conversation
Thanks for this! This is a bit of an interesting question to me though because Wasmtime is in a different position than rustc and is exposed to untrusted modules and this could in theory be an issue for embedders. That being said I think it'd still be good to solve this. One idea would be to expose the underlying indices that Wasmtime uses. For example instead of looking up via name on instances each time you could look up the name on a Would a scheme like that work for your use case? If so would you be interested in adapting this PR to such an approach? |
Yeah that makes sense for embedders. In my case I'm both the producer and consumer of the module (e.g., I'm writing the export names during encoding, and then immediately looking them up from the instance) so DoS isn't important. It would be useful to override the hasher throughout most types in cases like this but I can appreciate not wanting to complicate type signatures too. That scheme sounds ok, but I'm not sure how to expose the index we'd need without mapping bidirectionally between
Ideally I could get the |
Oh the raw value here that |
If we add a function like this to pub fn get_export_index(&self, name: &str) -> Option<EntityIndex> {
let module = self.compiled_module().module();
module.exports.get(name).copied()
} Then on pub fn get_export_by_index(&self, mut store: impl AsContextMut, index: EntityIndex) -> Option<Extern> {
// ... same boilerplate as `get_export` ...
let item = unsafe { Extern::from_wasmtime_export(instance.get_export_by_index(index), store) };
let data = &mut store[self.0];
data.exports[i] = Some(item.clone());
Some(item)
} The part I'm not sure about is setting the lazily populated |
Aha I apologize, I had forgotten about that. That was why the How about this: impl Module {
fn get_export(&self, name: &str) -> Option<ModuleExport>;
}
#[derive(Copy, Clone)]
pub struct ModuleExport {
module: CompiledModuleId, // make sure this isn't mixed up with different modules
entity: EntityIndex, // raw index into the wasm module
export_name_index: usize, // the index within the `IndexMap`
}
impl Instance {
pub fn get_module_export(&self, ..., export: &ModuleExport) -> Option<Export>;
} (please bikeshed the names) Basically package everything up in a new struct in the |
I suppose stepping back a little bit for a moment, one thing I would say is that in Wasmtime we've tried to basically avoid hash maps wherever we can. Where possible we use indexes into lists rather than strings to entries. While hash maps are necessary in the API somewhere it's often possible to lift the hash lookups outside of loops. Or at least that's what's worked out for us so far, not that we can't make the hash maps faster here just that we'd prefer to explore other avenues first (and also because even a faster hash map is probably much slower than a simple bounds check) |
Thanks for clarifying! I updated the PR to match the scheme you described. Is the purpose of including |
461b18b
to
fd3cb7a
Compare
rustc-hash
for module exportsModuleExport
to avoid repeated string lookups
Subscribe to Label Actioncc @peterhuene
This issue or pull request has been labeled: "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great thanks! Just a few minor comments but otherwise looks good to go.
Is the purpose of including CompiledModuleId on ModuleExport just out of convenience for user code, or should we compare these somewhere during get_module_export?
I commented below as well, but we'll want to check this when looking up exports to prevent this mistake of using the wrong module's exports to index an instance.
Thank you for the feedback! I updated the PR. I moved the lookup into the shared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
From a Zulip discussion a few months ago (https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Custom.20hashers) it was mentioned that custom hashers like
rustc-hash
should be ok for most usage of hashmaps/indexmaps because they don't need DoS protection.exports
showed up in a recent profile (e.g., when_get_export
doesinstance.module().exports.get_full(name)?;
) so I thought it might be good to start by addingrustc-hash
here. Because this is used inget_export
, this optimization should be useful generally.