Implement support for utf16+latin1 in the component model #4309

Closed
Tracked by #4185
alexcrichton opened this issue Jun 23, 2022 · 0 comments · Fixed by #4623
Labels
wasm-proposal:component-model Issues related to the WebAssembly Component Model proposal


Currently support for utf16+latin1 is not implemented in Wasmtime, but we'll need to finish this and test it before the component model is considered done.

In general I'd expect that this would use the encoding_rs crate for the internal details of latin1 to avoid open-coding that in Wasmtime itself.

Lowering

Lowering a string into wasm is currently unimplemented. I think this requires implementing the store_string_to_latin1_or_utf16 function from the canonical ABI explainer. My current understanding is that even if we could implement something more optimal in Rust, we can't, because the semantics of lowering are already specified.

I believe the pseudo-code there handles most of the fiddly bits, but some small helpers from encoding_rs will probably be required.
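To make the shape of the lowering decision concrete, here is a minimal std-only sketch. It is not the explainer's algorithm: the real store_string_to_latin1_or_utf16 also prescribes a sequence of realloc calls and alignment checks that this ignores. The helper name `encode_latin1_or_utf16` is hypothetical, and the `UTF16_TAG` value (the high bit of the length word) reflects my reading of the explainer.

```rust
/// High bit of the length word marks the payload as UTF-16
/// (assumption: this matches UTF16_TAG in the canonical ABI explainer).
const UTF16_TAG: u32 = 1 << 31;

/// Hypothetical helper: returns the encoded bytes and the tagged code-unit count.
/// Lengths are truncated with `as u32` for brevity; real code must bounds-check.
fn encode_latin1_or_utf16(s: &str) -> (Vec<u8>, u32) {
    if s.chars().all(|c| (c as u32) <= 0xFF) {
        // Every char fits in one byte, so store one latin1 byte per char.
        let bytes: Vec<u8> = s.chars().map(|c| c as u32 as u8).collect();
        let code_units = bytes.len() as u32;
        (bytes, code_units)
    } else {
        // At least one char is outside latin1, so fall back to little-endian UTF-16
        // and set the tag bit on the code-unit count.
        let units: Vec<u16> = s.encode_utf16().collect();
        let code_units = units.len() as u32;
        let bytes: Vec<u8> = units.iter().flat_map(|u| u.to_le_bytes()).collect();
        (bytes, code_units | UTF16_TAG)
    }
}
```

A string such as "hi" comes back as two latin1 bytes with an untagged length of 2, while any string containing a char above U+00FF comes back as UTF-16 with the tag bit set.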

Lifting

Calculating the byte length and actually reading the string out of memory are both unimplemented. I think we're free to use encoding_rs here however we see fit; the decode_latin1 function will probably be useful.
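For illustration, a rough std-only sketch of the two lifting pieces (byte length, then decoding) might look like the following. The function names are hypothetical, the `UTF16_TAG` value is my reading of the canonical ABI explainer, and alignment checks are omitted. A real implementation would presumably swap the hand-written loops for encoding_rs.

```rust
/// Assumption: the high bit of the length word marks UTF-16, as in the explainer.
const UTF16_TAG: u32 = 1 << 31;

/// Byte length of the payload: one byte per code unit for latin1, two for UTF-16.
fn byte_length(tagged_code_units: u32) -> usize {
    if tagged_code_units & UTF16_TAG != 0 {
        2 * (tagged_code_units ^ UTF16_TAG) as usize
    } else {
        tagged_code_units as usize
    }
}

/// Hypothetical lifting helper operating on a slice that starts at the string pointer.
fn lift_latin1_or_utf16(memory: &[u8], tagged_code_units: u32) -> Result<String, String> {
    let len = byte_length(tagged_code_units);
    let bytes = memory
        .get(..len)
        .ok_or_else(|| "string out of bounds".to_string())?;
    if tagged_code_units & UTF16_TAG != 0 {
        // Little-endian UTF-16; unpaired surrogates are reported as an error.
        let units = bytes
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
        char::decode_utf16(units)
            .collect::<Result<String, _>>()
            .map_err(|e| e.to_string())
    } else {
        // latin1: each byte is the code point of the same value, so this cannot fail.
        Ok(bytes.iter().map(|&b| char::from(b)).collect())
    }
}
```

Only the UTF-16 branch and the bounds check can fail here; the latin1 branch always succeeds.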

Other notes

I am personally unfamiliar with latin1 as an encoding. I don't know whether an arbitrary list of bytes is guaranteed to be valid latin1 or not (the infallibility of decode_latin1 seems odd to me).
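On the validity question: as far as I understand it, if latin1 is taken to mean ISO-8859-1 in the strict sense (each byte value is the Unicode code point with the same number, U+0000 through U+00FF), then every byte sequence is valid, and that would explain why decoding is infallible while encoding is not. A small std-only sketch of that mapping, with hypothetical helper names:

```rust
/// Decoding cannot fail: every byte 0x00..=0xFF is a valid code point U+0000..=U+00FF.
fn latin1_to_string(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Encoding can fail: any char above U+00FF has no latin1 representation.
fn string_to_latin1(s: &str) -> Option<Vec<u8>> {
    s.chars().map(|c| u8::try_from(c as u32).ok()).collect()
}
```

Round-tripping all 256 byte values through these two functions returns the original bytes, whereas a string containing a char such as U+20AC yields `None` on the encoding side.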

Using encoding_rs may also be a better option for the utf16 decoding we currently do (and maybe even utf8, since encoding_rs can probably do SIMD things that the standard library can't). If someone's intrepid, it might be interesting to benchmark this and see whether it's beneficial to use encoding_rs for almost everything.
