Finalizing how wit-bindgen will be used #214

Closed
alexcrichton opened this issue May 3, 2022 · 6 comments
Labels
open question Big topics that have no clear answer at this time

Comments

@alexcrichton
Member

This repository has served, up until now, as a sort of testing ground for features that the component-model has and what it might look like to have *.wit interfaces to work between guest wasm modules and the host. The original intention was a sort of polyfill for the component model before it was finished. Over time though more pieces have been brought into wit-bindgen and it's continued to evolve. To me now it's clear that the current state as-is is not going to be viable forever within the component model and we'll need to change things. I'm opening this issue to track some of these items and write down my thoughts on this.

To some degree wit-bindgen is not operating at the correct level for the component model. Generally wit-bindgen deals with *.wit files which are interfaces, but the component model typically has things like intrinsics in order to support interfaces such as allocation functions, future/handle-related functions, etc. These intrinsics are not captured well within wit-bindgen and while there's runtime support included in the Rust-to-wasm library for example it doesn't work well when the host gets involved. The main problem here is that for the wasmtime generator, or other host generators, hosts often want to consume a "shape" of a module (one with a particular interface) rather than a particular module itself (unlike the JS generator where the web often wants to consume a single wasm module). This means that for the wasmtime host we don't actually know the set of intrinsics to provide a core wasm module given the shape of generated code, which is trying to provide a Linker which can be used to instantiate any core wasm implementing the canonical ABI.

There are other implementation details that wit-bindgen also doesn't account for such as pointer size and string encoding. Right now wit-bindgen pervasively assumes utf-8 and 32-bit, but these are simply implementation details of an interface that a true component can encapsulate.
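To make the point about pointer size and string encoding concrete, here is a minimal Rust sketch of what a host has to know before it can read a string out of guest memory. This is not wit-bindgen's code: the `StringEncoding` and `PointerWidth` enums and both functions are hypothetical names invented for illustration, and the sketch assumes little-endian linear memory with a string passed as a (pointer, length) pair whose length counts code units.

```rust
/// Hypothetical ABI options that a component would fix; illustrative names only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StringEncoding {
    Utf8,
    Utf16,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum PointerWidth {
    Bits32,
    Bits64,
}

/// Read a little-endian (pointer, length) pair stored at `offset` in `memory`.
/// Returns `None` if the pair does not fit in memory or in `usize`.
fn read_ptr_len(memory: &[u8], offset: usize, width: PointerWidth) -> Option<(usize, usize)> {
    let size = match width {
        PointerWidth::Bits32 => 4,
        PointerWidth::Bits64 => 8,
    };
    let read = |at: usize| -> Option<usize> {
        let bytes = memory.get(at..at.checked_add(size)?)?;
        let mut buf = [0u8; 8];
        buf[..size].copy_from_slice(bytes);
        usize::try_from(u64::from_le_bytes(buf)).ok()
    };
    Some((read(offset)?, read(offset.checked_add(size)?)?))
}

/// Decode a string of `len` code units starting at `ptr`.
fn lift_string(memory: &[u8], ptr: usize, len: usize, enc: StringEncoding) -> Option<String> {
    match enc {
        StringEncoding::Utf8 => {
            let bytes = memory.get(ptr..ptr.checked_add(len)?)?;
            String::from_utf8(bytes.to_vec()).ok()
        }
        StringEncoding::Utf16 => {
            let byte_len = len.checked_mul(2)?;
            let bytes = memory.get(ptr..ptr.checked_add(byte_len)?)?;
            let units: Vec<u16> = bytes
                .chunks_exact(2)
                .map(|c| u16::from_le_bytes([c[0], c[1]]))
                .collect();
            String::from_utf16(&units).ok()
        }
    }
}
```

The same bytes yield different results depending on the two options, which is why a `*.wit` file alone does not tell a host how to lift a value.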

Overall this all adds up to the fact that specifically for hosts which want to consume any component that adheres to an interface wit-bindgen's model and design are not sufficient. There's no way to easily slot in an actual component and use that within the current design.

This leads to a list of items which I think need to be figured out before wit-bindgen is considered "finalized":

  • Generators for "hosts" such as wasmtime, js, wasmtime-py, etc, probably need either significant reworkings or entire rewrites. These hosts can be split into a few different use cases:
    • The wasmtime host wants to consume any component implementing an interface, which means that the actual input needs to be a component, not a core wasm module. This means that the component model needs to be natively implemented within Wasmtime itself and then wit-bindgen would generate code using that component model support. This would mean that all the intrinsics and canonical ABI translation that wit-bindgen does today would entirely go away since it would be the responsibility of Wasmtime itself.
    • The js host needs to be refactored to take a component as input, not *.wit files themselves. JS bindings to run a module cannot be generated without the component itself because the generated bindings are specific to ABI details such as string encoding, pointer size, intrinsics used, etc. Given any particular set we know how to implement it but fundamentally a *.wit file is not enough.
    • The wasmtime-py generator is implemented with the C API for Wasmtime. In the long run the C API for Wasmtime should support the component model itself, which means the wasmtime-py generator would probably look more like the wasmtime generator where it doesn't implement the canonical ABI in the generated code, rather simply working with types. In the meantime though, if it should be kept working, then this needs to look more like the JS host where a component is taken as input, which fixes ABI details such as intrinsics needed, string encoding, pointer size, etc.
    • The spidermonkey generator is going to need significant refactorings to use the component model rather than the now-outdated module linking proposal. It's not clear to me how intrinsics for things like futures and handles will work and I think this needs a significant amount of work to be fully implemented.
  • Overall the wit-bindgen CLI generating readable code is probably going to go away or needs to be significantly refactored. While it's useful for exploring and seeing what code is generated it's only a part of the picture and is missing significant pieces. For example the js side needs a component as input whereas the wasmtime side does not. The compiled-to-wasm Rust and C generators assume runtime support that's not necessarily obvious.
  • For compiled-to-wasm uses of wit-bindgen I think this project is in a pretty good place, but I still think there's a "last mile" that needs to be figured out. Notably, for handles and async functions, support will be needed for various intrinsics that are monomorphized. For example there will likely be an intrinsic along the lines of "register interest for this future with this type" and that's translated to some import at the core wasm layer, but right now there's no mechanism to tell some other tool that the core wasm import is intended to be a specific component intrinsic. This is basically saying that the cargo component story needs to be sure to account for all the intrinsics coming down the pike with resources and async when it bundles up the output from rustc into an actual component. Ideally this also needs to be at least somewhat applicable to C as well.
  • Overall there's a laundry list of other semi-unrelated issues with the design of wit-bindgen which also need to be figured out before too long:
    • Importing resources from other modules needs to be better designed and fleshed out. The current use system in *.wit files is very primitive and not really reflected in the generated code, meaning that it's almost always buggy. One example is that wit-bindgen has no real way in its generated code today to connect an imported resource to an exported resource and have them be the same thing. Those two halves are entirely separate right now and need to be unified. More generally, using use right now typically results in a lot of bugs.
    • Async support, including streams, needs to be fleshed out more in the component model itself. All of the support landed in-repo right now is probably going to be deleted since it's the wrong shape, and even the support in "Implement async witx functions for a Wasmtime host" (#82) is not sufficient and will need a redesign.
    • In general there are some open syntax questions in the issues which should be settled.
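
The "last mile" item above notes that there is currently no mechanism to tell another tool that a core wasm import is meant to be a specific component intrinsic. One conceivable shape for such a mechanism is a naming convention that a bundling tool recognizes. The Rust sketch below is purely illustrative: the `hypothetical:intrinsics` module name, the intrinsic names, and `classify_import` are all invented for this example and do not reflect any real wit-bindgen, cargo component, or component model behavior.

```rust
/// Hypothetical intrinsic kinds, each monomorphized over a type name.
#[derive(Debug, PartialEq)]
enum Intrinsic {
    FutureRegisterInterest { type_name: String },
    HandleDrop { type_name: String },
}

/// What a bundling tool might conclude about one core wasm import.
#[derive(Debug, PartialEq)]
enum ImportKind {
    /// A normal import that is not an intrinsic.
    Ordinary,
    /// An import recognized as a specific intrinsic.
    Intrinsic(Intrinsic),
    /// An import in the intrinsic namespace that the tool does not understand.
    UnknownIntrinsic(String),
}

/// Assumed reserved import module name; made up for this sketch.
const INTRINSIC_MODULE: &str = "hypothetical:intrinsics";

/// Classify an import by its (module, name) pair, where intrinsic names are
/// assumed to look like `<intrinsic>/<type>`.
fn classify_import(module: &str, name: &str) -> ImportKind {
    if module != INTRINSIC_MODULE {
        return ImportKind::Ordinary;
    }
    match name.split_once('/') {
        Some(("future-register-interest", ty)) if !ty.is_empty() => {
            ImportKind::Intrinsic(Intrinsic::FutureRegisterInterest {
                type_name: ty.to_string(),
            })
        }
        Some(("handle-drop", ty)) if !ty.is_empty() => {
            ImportKind::Intrinsic(Intrinsic::HandleDrop {
                type_name: ty.to_string(),
            })
        }
        _ => ImportKind::UnknownIntrinsic(name.to_string()),
    }
}
```

Reporting unrecognized names separately, rather than treating them as ordinary imports, would let a tool fail loudly when the guest toolchain emits an intrinsic it does not yet support.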

I also don't think that this is necessarily a complete list. Naturally wit-bindgen is an ever-evolving project as the component model is an ever-evolving standard. At a high-level the goal of wit-bindgen is to provide a natural place for code generators targeting the component model to live, but the precise goals of the past may not align well with what it needs to do today, so more significant refactorings may always be needed if I'm missing something here. These I think cover the high points though of what I see needs to change in the near-ish future.

@alexcrichton alexcrichton added the open question Big topics that have no clear answer at this time label May 3, 2022
@willemneal
Contributor

Thank you for the update and all the recent PRs! My one question concerns the stability of the .wit syntax. A recent PR updated the float type literal. How close is the project to adding semver for the wit syntax?

@alexcrichton
Member Author

I don't believe the current syntax is "1.0" in the sense that it's final. I don't personally have a great sense of how close it is to a "1.0 stable", but this will likely be entangled in the process of migrating WASI to the component model and the *.wit format for WASI proposals.

@willemneal
Contributor

It doesn't need to be "1.0 stable". Having some form of semver would allow third-party tooling that consumes wit, some of it already in development, to handle breaking changes, most likely with some preprocessor for older versions.
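
As a rough illustration of what such a check could look like on the tooling side, here is a minimal Rust sketch. It assumes a `major.minor.patch` version is somehow attached to a wit file, which the format does not define here; `Version`, `parse_version`, and `can_read` are hypothetical names, and the pre-1.0 rule follows the common Cargo convention of treating a minor bump as breaking.

```rust
/// A plain `major.minor.patch` version (no pre-release or build metadata).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

/// Parse exactly three dot-separated numeric components.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.trim().split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Can a tool that understands syntax version `tool` read a file declaring `file`?
fn can_read(tool: Version, file: Version) -> bool {
    if tool.major != file.major {
        return false;
    }
    if tool.major == 0 {
        // Before 1.0, assume a minor bump may be breaking.
        tool.minor == file.minor && tool.patch >= file.patch
    } else {
        (tool.minor, tool.patch) >= (file.minor, file.patch)
    }
}
```

A tool for which `can_read` returns false could then refuse the file or hand it to a version-specific preprocessor instead of misparsing it.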

@Mossaka
Member

Mossaka commented May 10, 2022

I don't have insights into how generators for host and guest languages would work. I am just wondering if there is a plan or ideas to make supporting a new programming language that understands *.wit interfaces easier.

Currently there are two languages for generating wasm binaries that use interface types: Rust and C. Will the proposal described here simplify the work to support other languages, such as Go, Java, or C#?

@alexcrichton
Member Author

The intention is that the component model is interoperable with a wide variety of programming languages so at the fundamental layer it's definitely not planned to simply only use Rust & C. Practically though code generators take time to create and are a big investment, so there aren't any plans by me at this time to expand the set of language support but that may change over time of course. Others are of course able to make bindings as well and the goal of this repository is to make it at least a somewhat smoother experience to support a new language instead of starting from scratch each time.

@alexcrichton
Member Author

Ok, I've written up more concrete plans and thoughts over at #314. I'm going to close this as it's otherwise a relatively vague issue and is intended to be supplanted by #314.
