diff --git a/proposals/module-linking/Binary.md b/proposals/module-linking/Binary.md index 0f73a56c..65c54ccd 100644 --- a/proposals/module-linking/Binary.md +++ b/proposals/module-linking/Binary.md @@ -62,7 +62,8 @@ referencing: * each `importdesc` is valid according to import section * Types can only reference preceding type definitions. This forces everything to - be a DAG and forbids cyclic types. + be a DAG and forbids cyclic types. TODO: with type imports we may need cyclic + types, so this validation will likely change in some form. ## Import Section updates @@ -89,12 +90,12 @@ importdesc ::= * the `module x` production ensures the type `x` is indeed a module type * the `instance x` production ensures the type `x` is indeed an instance type -## Module section (100) +## Module section (14) A new module section is added ``` -modulesec ::= t*:section_100(vec(typeidx)) -> t* +modulesec ::= t*:section_14(vec(typeidx)) -> t* ``` **Validation** @@ -103,12 +104,12 @@ modulesec ::= t*:section_100(vec(typeidx)) -> t* * This defines the locally defined set of modules, adding to the module index space. -## Instance section (101) +## Instance section (15) A new section defining local instances ``` -instancesec ::= i*:section_101(vec(instancedef)) -> i* +instancesec ::= i*:section_15(vec(instancedef)) -> i* instancedef ::= 0x00 m:moduleidx e*:vec(exportdesc) -> {instantiate m, imports e*} ``` @@ -122,24 +123,28 @@ the future we'll likely want this binary value to match that. **Validation** -* The type index `m` must point to a module type. +* The module index `m` must be in bounds. * Indices of items referred to by `exportdesc` are all in-bounds. Can only refer to imports/previous aliases, since only those sections can precede. -* The precise rules of how `e*` is validated against the module type's declare - list of imports is being hashed out in - [#7](https://github.com/WebAssembly/module-linking/issues/7). 
For now - conservative order-dependent rules are used where the length of `e*` must be - the same as the number of imports the module type has. Additionally the type - of each element of `e*` must be a subtype of the type that it's being matched - with. Matching happens pairwise with the list of imports on the module type - for `m`. +* The `e*` list is validated against the module type's declared list + of [imports pairwise and in-order](Explainer.md#module-imports-and-nested-instances). + The type of each item must be a subtype of the expected type listed in the + module's type. -## Alias section (102) +**Execution notes** + +* The actual module being instantiated does not need to list imports in the + exact same order as its type declaration. The `e*` has names based on the + local module type's declaration. +* Be sure to read up on [subtyping](./Subtyping.md) to ensure that instances + with a single name can be used to match imports with a two-level namespace. + +## Alias section (16) A new module section is added ``` -aliassec ::= a*:section_102(vec(alias)) -> a* +aliassec ::= a*:section_16(vec(alias)) -> a* alias ::= 0x00 i:instanceidx 0x00 e:exportidx -> (alias (instance $i) (func $e)) @@ -154,9 +159,10 @@ alias ::= **Validation** -* Aliased instance indexes are all in bounds +* Aliased instance indexes are all in bounds. Remember "in bounds" here means it + can't refer to instances defined after the `alias` item. * Aliased instance export indices are in bounds relative to the instance's - *locally-declared* (via module or instance type) list of exports + *locally-declared* (via module or instance type) list of exports. * Export indices match the actual type of the export * Aliases append to the respective index space. * Parent aliases can only happen in submodules (not the top-level module) and @@ -169,6 +175,12 @@ alias ::= items were declared after the module's type in the corresponding module section. 
+**Execution notes** + +* Note for child aliases that while the export is referred to by index it's + actually loaded from the specified instance by name. The name loaded + corresponds to the `i`th export's name in the locally defined type. + ## Function section **Validation** @@ -191,10 +203,10 @@ exportdesc ::= * Module/instance indexes must be in-bounds. -## Module Code Section (103) +## Module Code Section (17) ``` -modulecodesec ::= m*:section_103(vec(modulecode)) -> m* +modulecodesec ::= m*:section_17(vec(modulecode)) -> m* modulecode ::= size:u32 mod:module -> mod, size = ||mod|| ``` @@ -208,31 +220,9 @@ the top-level * Module definitions must match their module type exactly, no subtyping (or maybe subtyping, see WebAssembly/module-linking#9). * Modules themselves validate recursively. -* Must have the same number of modules as the count of all local module - sections. -* Each submodule is validated with a subset of the parent's context, for example - the set of types and instances the current module has defined are available - for aliasing in the submodule. - -## Subtyping - -Subtyping will extend what's currently ["import -matching"](https://webassembly.github.io/spec/core/exec/modules.html#import-matching) - -**Instances** - -Instance `{exports e1}` is a subtype of `{exports e2}` if and only if: - -* Each name in `e1` is present in `e2` -* For each corresponding name `n` in the sets - * `e1[n]` is a subtype of `e2[n]` - -**Instances** - -Module `{imports i1, exports e1}` is a subtype of `{imports i2, exports e2}` if and only if: - -* Each name in `e1` is present in `e2` -* For each corresponding name `n` in the sets - * `e1[n]` is a subtype of `e2[n]` -* ... And some condition on imports. For now this is a bit up for debate on - WebAssembly/module-linking#7 +* The module code section must have the same number of modules as the count of + all local module sections. +* Each submodule is validated with the parent's context at the time of + declaration. 
For example the set of types and modules the current module + has defined are available for aliasing in the submodule, but only those + defined before the corresponding module's type declaration. diff --git a/proposals/module-linking/Explainer.md b/proposals/module-linking/Explainer.md index d7f174a5..2cf0f6d8 100644 --- a/proposals/module-linking/Explainer.md +++ b/proposals/module-linking/Explainer.md @@ -284,7 +284,7 @@ example, an instance type can be defined: In many examples shown below, type definitions are needed for *both* a module type and the instance type produced when that module type is instantiated. In -such cases, to avoid duplicating all the exports, a new "zero-level export" +such cases, to avoid duplicating all the exports, a new "zero-level export" `(export $InstanceType)` form is added which injects all the exports of `$InstanceType` into the containing module type. For example, here is the type of a module which implements the above-defined `$WasiFile` interface via Win32 @@ -363,9 +363,9 @@ version of the same module can be written: (export "f1" (func $f1)) (export "f2" (func $f2 (param i32))) )) - (alias $i.f1 (func $i $f1)) + (alias $i.$f1 (instance $i) (func $f1)) (func (export "run") - call $i.f1 + call $i.$f1 ) ) ``` @@ -377,17 +377,17 @@ how multiple uses of inline function types [desugar][typeuse-abbrev] to the same function type definition. Aliases are not restricted to functions: all exportable definitions can be -aliased. One situation where an explicit `alias` definition will be required is -for a default memory or table: because there is no explicit `$i.$j` path used by -instructions to refer to defaults, they must be explicitly aliased: +aliased. 
One situation where an explicit `alias` definition may be required is +for a default memory or table since if there is no explicit `$i.$j` path used +by instructions to refer to defaults, they must be explicitly aliased: ```wasm (module (import "libc" (instance $libc (export "memory" (memory $mem 1)) (export "table" (table $tbl 0 funcref)) )) - (alias (memory $libc $mem)) ;; memory index 0 = default memory - (alias (table $libc $tbl)) ;; table index 0 = default table + (alias (instance $libc) (memory $mem)) ;; memory index 0 = default memory + (alias (instance $libc) (table $tbl)) ;; table index 0 = default table (func ... i32.load ;; accesses $libc.$mem @@ -544,11 +544,17 @@ the child's type index space: (export "read" (func (param i32 i32 i32) (result i32))) )) (module $child - (alias $WasiFile (parent (type $WasiFile))) + (alias $WasiFile parent (type $WasiFile)) (import "wasi_file" (instance (type $WasiFile))) ) ) ``` +Note that `parent` aliases can only refer to previously-defined items relative +to the module's own declaration in the module index space. This means that it +can refer to previously defined imports, modules, instances, or aliases, but it +cannot refer to imports (for example) that occur after the module's +declaration. A module is declared with its type and defined later in the binary +format. In general, language-independent tools can easily merge multiple `.wasm` files in a dependency graph into one `.wasm` file by performing simple transformations @@ -673,6 +679,9 @@ could be encoded with the binary section sequence: 10. ModuleCode Section, defining `$M` 11. Code Section, defining `$x` +This repository also contains an [initial proposal for the binary format +updates](./Binary.md). 
+
 ### Summary
@@ -764,7 +773,7 @@ More generally, when ESM-integration loads a module with a
 With this extension, a single JS app will be able to load multiple wasm programs
 using ESM `import` statements and have these programs safely and
-transparently share library code as described in
+transparently share library code as described in
 [shared-everything dynamic linking example](Example-SharedEverythingDynamicLinking.md).
diff --git a/proposals/module-linking/Subtyping.md b/proposals/module-linking/Subtyping.md
new file mode 100644
index 00000000..f348fd22
--- /dev/null
+++ b/proposals/module-linking/Subtyping.md
@@ -0,0 +1,203 @@
+# Subtyping
+
+Subtyping will extend what's currently ["import
+matching"](https://webassembly.github.io/spec/core/exec/modules.html#import-matching).
+The main gotcha with subtyping is how two module types' import lists are
+related; that is covered in more detail below.
+First, though:
+
+## Instance Subtyping
+
+Instance types are a map from the name of each export to the type of that
+export. We can generally say that for one map `e1` to be a subtype of another
+map `e2`, every name in `e2` must be present in `e1` with a subtype:
+
+```
+∀ name ∈ e2 : e1[name] ≤ e2[name]
+---------------------------------
+            e1 ≤ e2
+```
+
+And since that's all instance types are, we can use that to define subtyping
+between instances:
+
+```
+            e1 ≤ e2
+-------------------------------
+  {exports e1} ≤ {exports e2}
+```
+
+With some examples:
+
+```wasm
+(instance) ≤ (instance)
+
+;; If asked for something that doesn't have exports and provided something that
+;; has exports, that's ok.
+(instance (export "" (func))) ≤ (instance)
+
+;; Asked for something that exports an instance with no fields, it's ok to
+;; provide an instance which exports an instance with exports. When that
+;; instance's nested export is loaded we don't think it has anything, so it's ok
+;; that it does.
+(instance (export "" (instance (export "e" (func)))))
+  ≤ (instance (export "" (instance)))
+```
+
+## Import Elaboration
+
+Modules are trickier than instances, so this section first describes some tweaks
+to imports in the current proposal. It's important to note [that all two-level
+imports are now equivalent to instance
+imports](./Explainer.md#instance-imports-and-aliases), meaning that all
+imports are a name and a type. The proposed method of elaboration is to group
+all two-level imports with the same module name next to one another, and then
+flatten each group of two-level imports into a single one-level import of an
+instance with appropriately named and typed exports.
+
+For example this:
+
+```wasm
+(module $A
+  (import "a" "foo" (func))
+  (import "b" "bar" (func))
+  (import "a" "baz" (func))
+)
+```
+
+becomes this:
+
+```wasm
+(module $A
+  (import "a" (instance
+    (export "foo" (func))
+    (export "baz" (func))
+  ))
+  (import "b" (instance
+    (export "bar" (func))
+  ))
+)
+```
+
+A caveat with this, however, is that this module
+
+```wasm
+(module $A
+  (import "" "a" (func))
+  (import "" "a" (func (result i32)))
+)
+```
+
+cannot be recast as:
+
+```wasm
+(module
+  (import "" (instance
+    (export "a" (func))
+    (export "a" (func (result i32)))
+  ))
+)
+```
+
+because exports in instances must have unique names. As a result **this proposal
+introduces a breaking change** to forbid duplicate import strings between two
+imports (more discussion about this breaking change is available on
+[#7](https://github.com/WebAssembly/module-linking/issues/7) and [below as
+well](#breaking-change)). This would make module `$A` an invalid module.
+
+Note that this breaking change would also mean that these modules are all
+invalid:
+
+```wasm
+(module
+  (import "" (func))
+  (import "" "a" (func))
+)
+
+(module
+  (import "" (instance))
+  (import "" "a" (func))
+)
+
+(module
+  (import "" (instance))
+  (import "" (instance))
+)
+```
+
+## Module Subtyping
+
+With the import elaboration above, imports are the same kind of thing as
+exports: a map from name to type. Module `{imports i1, exports e1}`
+is then a subtype of `{imports i2, exports e2}` if this property holds:
+
+```
+           i2 ≤ i1           e1 ≤ e2
+---------------------------------------------------
+{imports i1, exports e1} ≤ {imports i2, exports e2}
+```
+
+Handling of exports is the same as for instances, which means that it's ok if
+the subtype exports more than the supertype expects. The subtlety with imports
+is that the order of checking is reversed: it's ok to import less than what's
+expected.
+
+Some examples of valid subtyping relationships are:
+
+```wasm
+(module) ≤ (module)
+
+;; If asked for something that needs an import and provided something that
+;; doesn't need any imports, that's ok
+(module) ≤ (module (import "" (instance)))
+
+;; Asked for something that imports an instance with a field export, it's ok to
+;; provide a module which imports an instance with no exports. When that module
+;; is given the instance-with-exports, it just ignores it.
+(module (import "" (instance)))
+  ≤ (module (import "" (instance (export "e" (func)))))
+```
+
+## Instantiation
+
+Instantiation primarily occurs via name-based resolution (e.g. in the JS API and
+other language embeddings) or position-based resolution (e.g. embedded engines).
+
+It's expected that the original import list of a module is retained to map
+position-based resolution to name-based resolution. With position-based
+resolution imports would need to be provided as-is with the module in question
+(no sugar applied where you can supply an instance for a function import). This
+enables the engine to transform a list of imports into a map from import name to
+an instance with exports.
+
+Name-based resolution wouldn't need to change much; it would allow
+top-level names to be defined with actual wasm instances or further host-defined
+maps of strings. Instance imports could then be satisfied with maps-of-strings
+so long as all the strings line up. Note that the `instantiate` instruction is
+expected to use name-based instantiation.
+
+In both cases instantiation is intended to become primarily name-based. This
+matches the intended behavior of the `instantiate` instruction in a wasm module
+which is to name all the provided items according to the declared module type
+that's being instantiated. This also enables embedders to work with instances
+supplied to satisfy a list of function imports. For example embedders would take
+a single "wasi instance" to satisfy all wasi function imports from a module.
+
+## Breaking change?!
+
+This entire interpretation of imports relies on the aforementioned breaking
+change, which is to disallow two import directives with the same name. The
+rationale for this breaking change is:
+
+* It's expected that in practice different import directives with the same name
+  are exceedingly rare. So far the only confirmed cases are in test harnesses
+  where you might import `console.log` with two signatures for example. It's
+  hoped we can [collect
+  data](https://bugzilla.mozilla.org/show_bug.cgi?id=1647791) to back up this
+  claim.
+
+* Another hope is that the spec can change to disallow duplicate imports, but
+  engines with backwards-compatibility guarantees could deviate from the spec
+  in this regard and allow duplicate import strings in older modules. This way
+  engines that can follow the spec exactly would do so, and engines that cannot
+  wouldn't have to break existing content.
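
Reviewer sketch (not part of the patch): the import elaboration and the two subtyping rules added in `Subtyping.md` can be modelled in a few lines of Python, which may help when checking the examples by hand. Everything named here is hypothetical. `Module`, `elaborate`, and `is_subtype` are illustrative names, instance types are modelled as plain dicts, and every other type is an opaque string compared by equality, which is an assumption rather than anything the proposal specifies.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical module type: maps of import and export names to types."""
    imports: dict = field(default_factory=dict)
    exports: dict = field(default_factory=dict)


def elaborate(imports):
    """Flatten (module, field, type) imports into a {name: type} map.

    `field` is None for a one-level import. Two-level imports sharing a module
    name are grouped into one instance type (a dict). Any duplicate name raises
    ValueError, mirroring the breaking change proposed in Subtyping.md.
    """
    result = {}
    two_level = set()  # module names whose entry was built from two-level imports
    for module, fld, ty in imports:
        if fld is None:
            if module in result:
                raise ValueError(f"duplicate import {module!r}")
            result[module] = ty
        else:
            if module in result and module not in two_level:
                raise ValueError(f"duplicate import {module!r}")
            group = result.setdefault(module, {})
            two_level.add(module)
            if fld in group:
                raise ValueError(f"duplicate import {module!r} {fld!r}")
            group[fld] = ty
    return result


def is_subtype(t1, t2):
    """Return True if t1 <= t2 under the simplified rules sketched here."""
    if isinstance(t1, dict) and isinstance(t2, dict):
        # Instance rule: every name t2 expects must exist in t1 at a subtype.
        return all(n in t1 and is_subtype(t1[n], t2[n]) for n in t2)
    if isinstance(t1, Module) and isinstance(t2, Module):
        # Module rule: imports are checked in reverse (i2 <= i1), exports normally.
        return (is_subtype(t2.imports, t1.imports)
                and is_subtype(t1.exports, t2.exports))
    # Assumption: all remaining types (e.g. functions) must match exactly.
    return t1 == t2
```

Under these assumptions, `elaborate([("a", "foo", "func"), ("b", "bar", "func"), ("a", "baz", "func")])` groups the imports of module `$A` into `{"a": {"foo": "func", "baz": "func"}, "b": {"bar": "func"}}`, and `is_subtype(Module(), Module(imports={"": {}}))` holds because a module that imports nothing can stand in for one that expects an import.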