Refresh, rescope and rename #21

lukewagner · 2019-03-22T22:41:23Z

This PR updates the Host Bindings proposal as discussed at TPAC and after a bunch of follow-up discussions.

The PR has an overview that I won't restate, but one additional change I'd like to make (if we accept this PR) is to rename this repo from "host-bindings" to "webidl-bindings" to reflect the new scope of the proposal (for reasons given in the FAQ of the PR).

To make sure this gets proper visibility, I'll add an agenda item to the April 2 CG meeting for discussion.

lukewagner · 2019-03-22T22:51:31Z

(Accidentally posted before filling out the PR, somehow. Updated with title/description.)

jgravelle-google

Looks good to me. Wait until after the CG meeting to merge?

littledan

This looks great! I'm a big fan of the design decisions to not include all of WebIDL and just start with some of the most important types (especially given @Ms2ger's ongoing work to remove unused legacy features from WebIDL) and to fall back to JS conversions when the WebIDL types don't match (given how, sometimes, specifications are refactored over time to have different IDL).

I'm looking forward to all of this proposal moving forward and becoming even more concrete. I can imagine many additional types of conversions making sense to add in conjunction with the GC proposal (e.g., passing strings in and out without copying).

lukewagner · 2019-03-26T17:10:11Z

Great to hear, thanks for the feedback. I'll hold off on merging until we've had a chance to discuss this at the next CG meeting.

binji

Agreed, this looks really good. It's much easier now to see how this all fits together, and where we can add more functionality to provide more expressive bindings. It'd be great to see more example bindings, including cases that aren't representable with the current binding expressions.

proposals/webidl-bindings/Explainer.md

annevk · 2019-03-29T10:06:23Z

Very exciting!

(given how, sometimes, specifications are refactored over time to have different IDL).

This is probably worth calling out explicitly as this is indeed somewhat common (including changing long to double and such). I take it this works for both input and return values? Really cool tactic, though I guess that means there will be a little bit more overhead.

Am I correct that the intention here is to always handle DOMString via a UTF-8 buffer? What happens if the DOMString contains lone surrogates?

It might be worth considering to map ByteString to a buffer (which I'm assuming is a byte sequence). The type is primarily used for byte sequences that happen to look a lot like strings and can be manipulated as such to some extent (e.g., HTTP header values). I guess the question is to what extent that distinction is meaningful for Wasm. Concretely, if a header value is 0xFF, do you want to see that as 0xFF or U+00FF expressed as two UTF-8 bytes?

lukewagner · 2019-03-29T20:49:31Z

@annevk Thanks for the comments; PTAL to see if they were addressed correctly.

This is probably worth calling out explicitly as this is indeed somewhat common (including changing long to double and such).

I added a third bullet to "Question 2" in the "JS API Integration" section.

I take it this works for both input and return values? Really cool tactic, though I guess that means there will be a little bit more overhead.

Yes, both input and return values (both being part of the "Web IDL Signature", which can mismatch). This is an instantiation-time check that only takes the slower, roundabout path on mismatch, so hopefully there is no overhead in the common case.
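To illustrate the "check once at instantiation, then pick a path" idea, here is a minimal sketch. Everything in it is hypothetical: the `bindImport` name, the modelling of signatures as arrays of type-name strings, and the two call paths are assumptions for illustration, not the proposal's actual mechanism or API.

```javascript
// Hypothetical sketch: a signature is modelled as an array of type names,
// e.g. ["DOMString", "long"]. Real Web IDL signatures are richer than this.
function bindImport(declaredSig, actualSig, fastCall, jsFallbackCall) {
  // The comparison runs once, when the import is bound (instantiation time).
  const matches =
    declaredSig.length === actualSig.length &&
    declaredSig.every((type, i) => type === actualSig[i]);

  // Later calls simply invoke whichever path was selected here, so a
  // matching signature pays no per-call cost for the check.
  return matches ? fastCall : jsFallbackCall;
}
```

With matching signatures the returned function is the direct path; on a mismatch (say, `long` changed to `double` in a refactored spec) it is the fallback that goes through the ordinary JS conversions.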

Am I correct that the intention here is to always handle DOMString via a UTF-8 buffer? What happens if the DOMString contains lone surrogates?

Ah, good question; I glossed over this. So I think this only comes up for the alloc-utf8-str incoming binding operator. My default assumption is that this case should trap (in general, there are a number of cases where binding operators can trap). I'll add a note to alloc-utf8-str.

It might be worth considering to map ByteString to a buffer (which I'm assuming is a byte sequence). The type is primarily used for byte sequences that happen to look a lot like strings and can be manipulated as such to some extent (e.g., HTTP header values). I guess the question is to what extent that distinction is meaningful for Wasm. Concretely, if a header value is 0xFF, do you want to see that as 0xFF or U+00FF expressed as two UTF-8 bytes?

Good point! Thinking about it some more, ByteString is practically a BufferSource; its only essential difference is that the ECMAScript Binding layer spits out a JS string instead of an ArrayBuffer (for obvious historical reasons). Thus, I'll remove ByteString from the string ops and add it to the buffer ops.
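The 0xFF question can be made concrete with a small sketch in plain JS. It only uses the standard `TextEncoder` API; the variable names are illustrative, and it says nothing about how the binding operators themselves would be specified.

```javascript
// A ByteString reaches JS as a string whose code units are all <= 0xFF.
// Here: a one-character "header value" holding the byte 0xFF.
const headerValue = "\u00FF";

// Treating it as text and UTF-8 encoding it yields two bytes: C3 BF.
const asUtf8 = Array.from(new TextEncoder().encode(headerValue));

// Treating it as a byte sequence maps each code unit to one byte: FF.
const asBytes = Array.from(headerValue, (ch) => ch.charCodeAt(0));

console.log(asUtf8, asBytes);
```

The second interpretation is the one that corresponds to handling ByteString as a buffer rather than as a string.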

annevk · 2019-03-30T19:33:00Z

Trapping is some kind of error? So if you store a lone surrogate in a Text node's data in JavaScript you cannot get it out through Wasm? Mapping the lone surrogates to U+FFFD seems preferable if so.

cc @hsivonen

alexcrichton · 2019-03-30T19:55:51Z

We've been discussing the problem of lone surrogates recently with wasm-bindgen, and our default usage of TextEncoder to convert a JS string to a Rust utf-8 string does use the replacement character today. What it means though is that we didn't realize that the conversion from a JS string to a Rust string was lossy initially, although we do know now!

Our current thinking is that we will add a method to detect if a JS string is invalid utf-16 (aka has a lone surrogate), and then code that wants to be robust will do the moral equivalent of taking anyref and doing runtime checking.

In that sense I think that I might agree with @annevk that using replacement characters on lone surrogates might make more sense since it matches TextEncoder's behavior. It would be important to document though!
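For readers unfamiliar with the lossy behavior being discussed, a minimal sketch using the standard `TextEncoder` and `TextDecoder` APIs (the variable names are just for illustration):

```javascript
// "a" followed by a lone high surrogate (the first half of a surrogate pair).
const lone = "a\uD83D";

// TextEncoder takes a USVString, so the lone surrogate is replaced with
// U+FFFD before encoding; U+FFFD is EF BF BD in UTF-8.
const bytes = Array.from(new TextEncoder().encode(lone));

// Decoding the bytes gives "a\uFFFD", not the original string.
const roundTripped = new TextDecoder().decode(Uint8Array.from(bytes));
const lossy = roundTripped !== lone;

console.log(bytes.map((b) => b.toString(16)), lossy);
```

No error is raised at any point, which is why the loss is easy to miss.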

Pauan · 2019-03-30T20:01:06Z

To clarify a bit, when the user types a single character that is encoded as a surrogate pair (e.g. an emoji) in an <input> on Windows, it will send two input events (one for each surrogate in the pair).

So when the first input event happens, it sends the string from JS to Rust (which uses TextEncoder to convert from UTF-16 to UTF-8), and TextEncoder replaces the unpaired surrogate with the replacement character.

That's pretty terrible since now the string doesn't match what the user typed (and this then leads to other bugs).

So personally I would prefer a hard error, but I agree with @alexcrichton that matching existing TextEncoder behavior is probably correct (albeit unfortunate).

annevk · 2019-03-30T20:08:32Z

@Pauan is there an issue tracking that particular way of dealing with user input? That seems like something that ought to be fixed in implementations (and specification, if it's unclear).

(And yeah, TextEncoder expects the caller to not pass in lone surrogates (because it uses USVString, which replaces lone surrogates with U+FFFD, not DOMString, which preserves them). We could potentially add a static or some such that tells you whether your string contains them. Please file something at https://github.com/whatwg/encoding if that would be useful.)

Pauan · 2019-03-30T20:21:05Z

@annevk is there an issue tracking that particular way of dealing with user input?

On our side we have rustwasm/wasm-bindgen#1348 (linked earlier by @alexcrichton).

The solution we decided upon was to just completely ignore input events which contain unpaired surrogates. This of course requires detecting (in JS) whether a string contains unpaired surrogates or not.

As for a tracking issue in browsers or the spec, not that I'm aware of. But then again, I haven't really looked.

That seems like something that ought to be fixed in implementations (and specification, if it's unclear).

If it could be fixed in browsers, then that would be fantastic! Right now we have to ignore input events which contain unpaired surrogates.

We could potentially add a static or some such that tells you whether your string contains them.

Yeah, that would be useful! Right now we have to use a function like this, but it can probably be implemented faster in browsers.
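The function linked above is not reproduced in this thread. As a rough idea of what such a check involves, here is a hypothetical helper (the name `hasLoneSurrogate` and its exact shape are assumptions, not the wasm-bindgen code):

```javascript
// Hypothetical helper: returns true if `s` contains a surrogate code unit
// that is not part of a well-formed high+low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: must be immediately followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return true;
      i++; // skip the low half of the valid pair
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}
```

An input handler could call this on the event's string and skip the event when it returns true, which matches the "ignore input events which contain unpaired surrogates" approach described above.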

lukewagner · 2019-04-01T15:13:10Z

Oops, right. Ultimately, I assumed the spec here would delegate to TextEncoder, so I'll just say that explicitly instead of guessing (incorrectly) what it would do.

proposals/webidl-bindings/Explainer.md

@chicoxyzzy

Thanks @chicoxyzzy! Co-Authored-By: lukewagner <mail@lukewagner.name>

fgmccabe · 2019-04-04T21:55:27Z

proposals/webidl-bindings/Explainer.md

+statement:
+
+```wasm
+(webidl-type $Contact (dict (field "name" DOMString) (field "age" long)))


IMO, the text format of this should be as close as possible to the 'official' syntax for webIDL itself.

I think consistency with the existing text format is also a useful property, but probably we can discuss this in a follow-up issue and iterate on the explainer.

Can we try to uniformly render custom sections in the text format by means of the proposed custom annotation syntax? That would prevent all sorts of compatibility and tooling issues.

Ah, I forgot all about that. So, iiuc, the request is to make sure that any new syntax is contained by some (@webidl ...), and then otherwise the ... can be anything (as long as it follows the basic paren-matching rules)?

Within limits. It has to be lexible as Wasm tokens. That covers a wide range of lexical syntax (because Wasm identifiers can be almost anything), but we'd need to extend the proposal slightly if we wanted to allow other kinds of brackets as well or "foreign" punctuation like commas and semicolons. Those wouldn't be a problem. (Though it would be a problem to allow completely different lexing rules, mainly because of parens in comments and strings.)

(That said, I wouldn't recommend going wild with annotation syntax but rather keep the spirit of the surrounding Wasm syntax.)

Heh, that makes sense, but for the two practical options we're considering here (wat-style or Web IDL), it sounds like we're fine, so the only real requirement is, whatever it is, it is seated in a (@webidl ...), not a (webidl...), right?

Righto, will fix, thanks for pointing that out.

lukewagner · 2019-04-05T00:35:57Z

Following the unanimous poll at the last CG meeting and no objections since, I'll merge, rename the repo, and we can continue to iterate on the design going forward.

fgmccabe · 2019-04-05T04:09:08Z

In this case there is a clear conflict between the prior precedents of webidl and those of WebAssembly. IMO webidl should prevail, because otherwise there will be a lot of unnecessary reformatting needed. Furthermore, we would end up having to restate the webidl specification and also having to own that restatement. On the other hand, there is no binary representation of webidl, so we have to invent it. In that case the binary format is consistent with WebAssembly.

On Thu, Apr 4, 2019 at 8:57 PM Andreas Rossberg wrote:

In proposals/webidl-bindings/Explainer.md:

+used as Web IDL Types.
+
+As an example, the Web IDL Dictionary:
+
+```WebIDL
+dictionary Contact {
+  DOMString name;
+  long age;
+};
+```
+
+could be defined in a WebAssembly module with this (strawman text-format)
+statement:
+
+```wasm
+(webidl-type $Contact (dict (field "name" DOMString) (field "age" long)))

Can we try to uniformly render custom sections in the text format by means of the proposed custom annotation syntax <https://github.com/WebAssembly/annotations/blob/master/proposals/annotations/Overview.md>? That would prevent all sorts of compatibility and tooling issues.

rossberg · 2019-04-05T04:30:28Z

FWIW, the custom annotation syntax should probably be able to embed existing WebIDL syntax:

(@webidl
  dictionary Contact {
    DOMString name;
    long age;
  };
)

would be syntactically well-formed. Whether it's desirable stylistically I'll leave for others to decide.

lukewagner · 2019-04-05T05:00:08Z

Good to know; probably best to open a new issue to discuss Web IDL vs s-expression syntax.

littledan · 2019-04-06T15:38:21Z

I'd suggest not using the WebIDL surface syntax, as this has changed over time and is continuing to change.

Luke Wagner added 3 commits March 22, 2019 16:57

First draft

1c46f9f

Resize images

d8d6000

Replace the last TODOs

ec96ae3

lukewagner changed the title ~~Refresh~~ Refresh, rescope and rename Mar 22, 2019

jgravelle-google reviewed Mar 25, 2019

littledan approved these changes Mar 25, 2019

binji reviewed Mar 27, 2019

alexcrichton reviewed Mar 27, 2019

Luke Wagner added 3 commits March 27, 2019 17:16

Address most of Ben's comments

a8d7e7f

Fix mangled sentence

c07af1d

Improve binding operator table readability

6568e70

lukewagner force-pushed the refresh branch from dbe82b4 to 6568e70 Compare March 27, 2019 22:41

Luke Wagner added 2 commits March 27, 2019 17:53

Add JS instantiation code to encodeInto example

383eeb3

Rename ctor field to avoid confusing syntax highlighting

ee4f4f1

chicoxyzzy mentioned this pull request Mar 28, 2019

Relation to WebAssembly WebIDL Bindings proposal tc39/proposal-idl#3

Open

aardappel reviewed Mar 28, 2019

Luke Wagner added 3 commits March 28, 2019 18:09

Add 'buffer' operator as suggested by Wouter

803ed0b

Add comma

b4f2f77

Fix typo, add emphasis

1d57b39

Address Anne's comments

bfa1a8e

Luke Wagner added 3 commits March 29, 2019 15:52

Add missing apostrophe

5cc59e8

Fix reference links

dd5fbb3

Fix another reference link

869a419

Explicitly refer to the Encoding spec for string binding operators

ba75b91

fgmccabe reviewed Apr 1, 2019

bjfish mentioned this pull request Apr 2, 2019

Host <-> Sandbox interop wasmerio/wasmer#315

Closed

This was referenced Apr 2, 2019

Consider adding TextEncoder.containsLoneSurrogates() static whatwg/encoding#174

Closed

UI events containing lone surrogates w3c/uievents#227

Open

bjfish reviewed Apr 2, 2019

chicoxyzzy reviewed Apr 2, 2019

chicoxyzzy and others added 5 commits April 2, 2019 12:16

Add wasm and WebIDL syntax highlighting

848a70f

Thanks @chicoxyzzy! Co-Authored-By: lukewagner <mail@lukewagner.name>

Rename 'buffer' to 'copy' to account for addition of ByteString

a209603

Clarify wording for JS API instantiation

4a32d20

Add missing word

1d0bd9e

Update encodeInto diagram to match result binding map

94f0aea

binji mentioned this pull request Apr 4, 2019

Binary encoding of webIDL #22

Merged

fgmccabe reviewed Apr 4, 2019

lukewagner merged commit e52ecda into master Apr 5, 2019

lukewagner deleted the refresh branch April 5, 2019 00:36

lukewagner restored the refresh branch April 5, 2019 00:45

lukewagner deleted the refresh branch April 5, 2019 00:46

cldershem mentioned this pull request Apr 10, 2019

Link to "host bindings proposal" is broken. rustwasm/book#162

Closed

This was referenced Apr 11, 2019

formatting tweaks for the overview #15

Closed

Update Explainer strawman text format to conform to Custom Annotation Syntax #24

Merged

lukewagner pushed a commit that referenced this pull request Nov 22, 2021

Put module and instance types into their own index spaces (#21)

0bd68f7
