Isolated realms with sync messaging passing #289

Closed
domenic opened this issue Jan 20, 2021 · 61 comments

@domenic
Member

domenic commented Jan 20, 2021

Hi realms champions,

@syg and I have been considering a modification to the current realms proposal which trades some expressivity, to give better isolation guarantees. Essentially, instead of allowing direct bidirectional access between the parent realm and the constructed realm, all such communication would go through structured cloning. This ensures that the child realm never gets access to objects from the parent realm, thus making "sandbox escapes" such as those in #277 or nodejs/node#15673 impossible by construction.

Sample code and API

We don't have strong feelings on the API for this; we'd like it to be as ergonomic as possible. But here is an initial idea.

For pulling values out of the constructed realm, into the parent realm: introduce realm.eval().

const realm = new Realm();

// value is a structured clone of the completion value
const value = realm.eval("[1, { foo: 'bar' }]");

// Its prototype chain is thus based on *the parent realm*'s intrinsics
console.assert(value.__proto__ === Array.prototype);
console.assert(value[1].__proto__ === Object.prototype);

// If you try to get access to a constructed realm's prototype or constructor,
// you get a clone, which isn't very useful:

try {
  const value = realm.eval("Array");
} catch (e) {
  // Can't clone functions
}

const value2 = realm.eval("Array.prototype");
// value2 is an empty plain object (structured clone only clones enumerable properties)
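
The cloning behaviour described above can be tried today with the standard structuredClone() global, which runs the same algorithm within a single realm. This is only a sketch: tryClone and Point are hypothetical names introduced for illustration, and the proposed realm.eval() would additionally rebase the clone onto the parent realm's intrinsics.

```javascript
// Clone a value the way a boundary crossing would, reporting failures
// instead of throwing. `tryClone` is a hypothetical helper.
function tryClone(value) {
  try {
    return { ok: true, value: structuredClone(value) };
  } catch (error) {
    return { ok: false, reason: error.name };
  }
}

// Hypothetical class, used to show that class identity is not preserved.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

const original = [1, { foo: 'bar' }, new Point(1, 2)];
const { value: copy } = tryClone(original);

console.log(copy !== original); // true: a fresh object graph
console.log(Object.getPrototypeOf(copy[2]) === Object.prototype); // true: Point becomes a plain object
console.log(tryClone(() => {}).ok); // false: functions cannot be cloned
```

The last line is the same failure the try/catch around realm.eval("Array") anticipates.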

For getting values into the constructed realm, from the parent realm, a bare minimum might look like this:

const realm = new Realm();

realm.set("foo", "bar");
console.assert(realm.eval("globalThis.foo === 'bar'") === true);

but you could imagine something slightly more complicated, and more useful, such as

const realm = new Realm();
realm.eval("globalThis.add = (x, y) => x + y");

const result = realm.call("add", 2, 3);
console.assert(result === 5);

(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.postMessage().)
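
To make the shape of eval(), set() and call() concrete, here is a rough userland approximation built on Node's vm module. Everything in it is an assumption for illustration: RealmSketch is a hypothetical name, a vm context is not the proposed Realm, and JSON text stands in for structured cloning (so functions are silently dropped instead of throwing). It is not a security boundary.

```javascript
const vm = require('node:vm');

// Hypothetical stand-in for the proposed Realm. Only JSON text crosses the
// boundary, so neither side receives the other side's objects.
class RealmSketch {
  #context = vm.createContext({});

  // Evaluate `expression` in the child, serialize the result there, and
  // rebuild it in the parent with the parent's own JSON.parse.
  #run(expression) {
    const json = vm.runInContext(`JSON.stringify(${expression})`, this.#context);
    return json === undefined ? undefined : JSON.parse(json);
  }

  // Build a child-side expression that recreates a parent value from JSON.
  static #copyIn(value) {
    const json = JSON.stringify(value);
    return json === undefined ? 'undefined' : `JSON.parse(${JSON.stringify(json)})`;
  }

  eval(sourceText) {
    return this.#run(`(0, eval)(${JSON.stringify(sourceText)})`);
  }

  set(name, value) {
    this.#run(`void (globalThis[${JSON.stringify(name)}] = ${RealmSketch.#copyIn(value)})`);
  }

  call(name, ...args) {
    return this.#run(`globalThis[${JSON.stringify(name)}](...${RealmSketch.#copyIn(args)})`);
  }
}

const realm = new RealmSketch();
realm.set('foo', 'bar');
console.log(realm.eval("globalThis.foo === 'bar'")); // true

realm.eval('globalThis.add = (x, y) => x + y');
console.log(realm.call('add', 2, 3)); // 5

const value = realm.eval("[1, { foo: 'bar' }]");
console.log(Object.getPrototypeOf(value) === Array.prototype); // true
```

Because the result is rebuilt by the parent's JSON.parse, its prototype chain comes from the parent's intrinsics, which mirrors the assertion in the first sample.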

Finally, for pushing values out of the constructed realm into the parent realm, you'd need something like this:

const realm = new Realm({
  handler(...args) {
    console.assert(args[2].__proto__ == Object.prototype);
    console.log(args);
  },
  exposeHandlerAs: "callParent"
});

console.log(realm.eval("globalThis.callParent.toString()"));
// logs "function () { [native code] }": callParent is installed by the implementation inside
// the constructed realm, and structured-clones its arguments to pass to handler()
// in the outer realm.

realm.eval("globalThis.callParent(1, 2, { foo: 'bar' })");
// logs 1, 2, and (a structured clone of) the { foo: 'bar' } object

(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.onmessage.)
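
The handler/exposeHandlerAs pair could be approximated in userland along the following lines. This is a sketch under the same assumptions as before: createRealmWithHandler is a hypothetical helper, a Node vm context stands in for the constructed realm, and JSON text stands in for structured cloning.

```javascript
const vm = require('node:vm');

// Hypothetical helper: returns a vm context whose global `exposeAs` function
// forwards JSON-copied arguments to `handler` in the parent.
function createRealmWithHandler(handler, exposeAs) {
  const context = vm.createContext({});

  // The bridge only accepts and returns strings (or undefined).
  const bridge = (argsJson) => JSON.stringify(handler(...JSON.parse(argsJson)));

  // The installer is evaluated inside the context, so the exposed function
  // is itself a child function; `bridge` stays hidden in its closure.
  const install = vm.runInContext(
    `(bridge, name) => {
      globalThis[name] = (...args) => {
        const json = bridge(JSON.stringify(args));
        return json === undefined ? undefined : JSON.parse(json);
      };
    }`,
    context
  );
  install(bridge, exposeAs);
  return context;
}

const received = [];
const context = createRealmWithHandler((...args) => {
  received.push(args);
  return args.length;
}, 'callParent');

const count = vm.runInContext("globalThis.callParent(1, 2, { foo: 'bar' })", context);
console.log(count); // 3
console.log(Object.getPrototypeOf(received[0][2]) === Object.prototype); // true
```

The handler sees a copy of { foo: 'bar' } built from the parent's intrinsics, never the child's object. A real implementation would also have to decide what happens when the handler throws, which this sketch does not address.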

And, of course, we'd remove realm.globalThis.

Use case analysis

This proposal is arguably better than the current one for many sandboxing use cases. In particular, for cases such as templating or computation where the goal is to have a (conceptually) pure function execute inside the realm, this architecture is ideal, especially in how it automatically prevents "impurities" from cross-realm contamination. In such cases, the values passed are often primitives, or if not, they're within the realm of structured clone: plain objects, arrays, maybe some Maps and Sets and Errors and Dates and typed arrays/ArrayBuffers.

For cases such as a virtualized environment, it requires more work, but probably on about the same level as membrane-based approaches. That is, to perform operations inside the realm while interfacing with a same-realm object API, you would have to create proxies (either literal Proxys or just wrappers) which perform the appropriate calls to realm.eval() and realm.call(). And similarly for the reverse: if code inside the realm wants to operate on an inside-realm object while really doing work in the outside realm, the outside realm would need to do some setup, using realm.eval() to inject some proxies which call globalThis.callParent(). (Probably that setup code would then also do realm.eval("delete globalThis.callParent") at the end.) This is equivalent to what is being done today in the AMP WorkerDOM example that the explainer cites, but by using synchronous realm.call() etc. instead of asynchronous worker.postMessage(), it would overcome the challenges you discuss there.

Other use cases like running tests in the realm fall in between. You'd need to inject a small shim into the realm which provides globals that the test library depends on (such as console), proxied to the creator realm. But then you'd just run the test library inside the global. I.e. instead of the explainer's current sample code, you'd write

import { consoleShimOutside } from 'console-shim';
const realm = new Realm(consoleShimOutside);
realm.import('console-shim/inside');
realm.import('testFramework');
realm.import('./main-spec.js');

This also gives you a stronger guarantee that tests don't mess with the test framework, or with the outer realm, or with other tests, all of which are possible in the current explainer's sample code.

This proposal does lose some expressivity though. In particular, it is not able to create reference cycles between cross-realm objects. Because all communication is via cloned messages, there's no way to communicate to the garbage collector that an object in the outer realm depends on an object in the inner realm, and vice versa, so that the cycle can take part in liveness detection. To some extent this is a good thing, as cycles are an easy way to leak an entire realm. But from what I understand it does cut off some use cases that go beyond the ones mentioned in the current realms explainer.

Performance

Adding a structured clone step for all boundary-crossing operations could come at a performance cost. But, less so than you'd imagine.

In particular, since primitives are trivially cloneable, any operation which returns them would suffer virtually no overhead vs. the current realms proposal, when communicating across the boundaries. This can account for a large number of use cases: e.g., most computation use cases, or the test framework use case (where it's just passing console.log strings across), or many of the interesting virtualization cases. Other cases will be covered by small objects or arrays, for which the structured clone overhead is quite small (less than JSON serialization and parsing). It's only the case of needing to return a large, nested object graph where there might be a noticeable performance disadvantage.

It's also worth noting that although this proposal does have a lower theoretical performance ceiling than the current realms proposal, it's probably comparable to the current realms proposal plus the associated membrane machinery that's needed to preserve encapsulation. There might be interesting tradeoffs in the large nested object graph case. There, structured cloning across the boundary means a larger up-front cost, but after that initial cost is paid, subsequent accesses throughout the large object graph are fast and well-optimized. Whereas membrane use means every access throughout the wrapped object graph incurs membrane-related overhead.

Finally, I haven't thought much about this, but you could probably get ultimate performance™ by passing in a SharedArrayBuffer and doing all communication through that.

Conclusion

I'm optimistic that this proposal removes the most dangerous feature of realms, which is that they advertise themselves as an encapsulation mechanism, but it is extremely easy to shoot oneself in the foot and break encapsulation. This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.

There still remains a danger with people over-using realms when they need security or performance isolation, beyond just encapsulation. This still weighs heavily on me, and its conflict with the direction the web is going (per #238) makes me still prefer not providing a realms API at all, in order to avoid such abuse. But I recognize there are cases where synchronous access to another computation environment is valuable, and I think if we curtailed the footgun-by-default nature of realms by prohibiting direct cross-realm object access, I could make peace with the proposal.

I look forward to hearing your thoughts, and hope we can meet on this "middle ground" between no realms on the web, and the current proposal.

@Jamesernator

Jamesernator commented Jan 20, 2021

My thoughts are that this is a bit weak by itself due to the lack of cross-realm references. However, it does make me wonder if something like a sync version of what Puppeteer/Playwright do with JSHandles, plus the ability to structured-clone a handle, would work.

For example:

const realm = new Realm();

const arrayHandle = realm.evaluateHandle(`[1,2,3]`);

// Push an item into the array in the realm, non-handles are structured cloned
realm.evaluate(`array.push(value)`, { array: arrayHandle, value: 4 });

// We can also ask for a handle to be structured cloned back to us
const arrayClone = arrayHandle.cloneIntoThisRealm();

console.log(arrayClone); // [1,2,3,4]
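
A handle can be little more than an opaque token for an entry in a table kept inside the realm. The sketch below works through the example above under loud assumptions: createHandleRealm is a hypothetical helper, a Node vm context stands in for the realm, and JSON text stands in for structured cloning.

```javascript
const vm = require('node:vm');

// Runs inside the child context: keeps the id -> object table private there.
const CHILD_SOURCE = `(() => {
  const handles = new Map();
  let nextId = 0;
  const run = (code, scopeJson) => {
    const scope = JSON.parse(scopeJson);
    const names = Object.keys(scope);
    const values = names.map((n) =>
      'handle' in scope[n] ? handles.get(scope[n].handle) : scope[n].value);
    return new Function(...names, 'return (' + code + ');')(...values);
  };
  return {
    evaluateHandle(code, scopeJson) {
      const id = nextId++;
      handles.set(id, run(code, scopeJson));
      return id;
    },
    evaluate: (code, scopeJson) => JSON.stringify(run(code, scopeJson)),
    clone: (id) => JSON.stringify(handles.get(id)),
  };
})()`;

// Hypothetical parent-side API modelled on the example above.
function createHandleRealm() {
  const child = vm.runInContext(CHILD_SOURCE, vm.createContext({}));
  const ids = new WeakMap(); // handle object -> child-side id
  const parse = (json) => (json === undefined ? undefined : JSON.parse(json));
  // Handles are sent as ids; everything else is sent as a JSON copy.
  const encodeScope = (scope) => JSON.stringify(Object.fromEntries(
    Object.entries(scope).map(([name, v]) =>
      [name, ids.has(v) ? { handle: ids.get(v) } : { value: v }])));
  const makeHandle = (id) => {
    const handle = Object.freeze({ cloneIntoThisRealm: () => parse(child.clone(id)) });
    ids.set(handle, id);
    return handle;
  };
  return {
    evaluateHandle: (code, scope = {}) => makeHandle(child.evaluateHandle(code, encodeScope(scope))),
    evaluate: (code, scope = {}) => parse(child.evaluate(code, encodeScope(scope))),
  };
}

const sketchRealm = createHandleRealm();
const handle = sketchRealm.evaluateHandle('[1, 2, 3]');
sketchRealm.evaluate('array.push(value)', { array: handle, value: 4 });
console.log(handle.cloneIntoThisRealm()); // [ 1, 2, 3, 4 ]
```

The sketch never releases table entries, so a real design would still need the lifetime management discussed further down the thread.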

Handles work naturally for things like callback patterns. For example, suppose we wanted to expose something like setTimeout in the realm:

const realm = new Realm();

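const realmSetTimeout = realm.createFunction(
    // Arguments follow the usual setTimeout(callback, delay, ...args) order
    function setTimeout(callbackHandle, delayHandle, ...callbackArgumentHandles) {
        const delay = delayHandle.cloneIntoThisRealm();
        // Use globalThis.setTimeout so this same-named function does not call itself
        globalThis.setTimeout(() => {
            realm.evaluate(`callback(...args)`, {
                callback: callbackHandle,
                args: callbackArgumentHandles,
            });
        }, delay);
    }
);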

realm.globalThisHandle.setPropertyDescriptor('setTimeout', { 
    value: realmSetTimeout,
    enumerable: true,
    configurable: true,
    writable: true,
});

A rough API would be something like this:

type StructuredClonable = ...;

class RealmHandle {
  // Clone the value from the realm into this Realm using structured clone
  cloneIntoThisRealm(): StructuredClonable;

  // Object meta operations, same as Reflect.* except 
  // accept RealmHandles and return RealmHandles
  apply(
    target: RealmHandle,
    thisArgument: RealmHandle | StructuredClonable,
    argumentsList: RealmHandle | Array<RealmHandle | StructuredClonable> | StructuredClonable,
  ): RealmHandle;
  construct(...) ...
  ...
}

class Realm {
  // A RealmHandle for the global object
  get globalThisHandle(): RealmHandle;

  // Evaluate and clone the return value, this is basically a shortcut for
  // .evaluateHandle().cloneIntoThisRealm();
  evaluate(
      code: string,
      scope: Record<string, RealmHandle | StructuredClonable>,
  ): StructuredClonable;
  evaluateHandle(
      code: string,
      scope: Record<string, RealmHandle | StructuredClonable>,
  ): RealmHandle;
  // Creates a function in the Realm that calls the given function with RealmHandles
  // for all arguments passed to it
  createFunction(
      func: (...args: any[]) => StructuredClonable,
      scope: Record<string, RealmHandle | StructuredClonable>,
  ): RealmHandle;
}

@leobalter
Member

@domenic I want us to meet at a middle-ground for sure. I asked my teammates to take a look.

In particular, it is not able to create reference cycles between cross-realm objects. Because all communication is via cloned messages, there's no way to communicate to the garbage collector that an object in the outer realm depends on an object in the inner realm, and vice versa, so that the cycle can take part in liveness detection.

This is one of the concerns that requires specific review, I appreciate that you're weighing the options here.

@Jack-Works
Member

One of the problems is that we don't have structured cloning in the language. I have a proposal for that https://github.com/Jack-Works/proposal-serializer if you're interested.

@caridy
Collaborator

caridy commented Jan 21, 2021

@domenic thanks for taking the time to do this write-up. We have been debating this with @syg for a few weeks; in fact, we did some prototyping around it to measure perf (I believe that's not a deal breaker, which is good news). We also did some homework to see if some of the existing membranes will work with this proposal, and here is where things become complicated (as @leobalter mentioned above).

One thing that occurs to me is that maybe we can provide another internal mechanism to help overcome this issue. I honestly don't even know if this is possible, but here is an idea:

realm.eval(`globalThis.foo = { x: 1 }`);
realm.eval(`Array.prototype`) === realm.eval(`Array.prototype`); // to yield `true`
realm.eval(`foo`) === realm.eval(`foo`); // to yield `true`

Basically, what I'm asking is if the UA can do some ref-tracking across this boundary, so the incubator realm's ref (in this case empty plain object based on your example) can continue to be linked to the corresponding ref from the realm, and release that memory when the realm doesn't need access to that object anymore.

This is clearly quite different from the already well-established structured cloning algo; it would be a variation of it that reuses some references when possible. Since both realms are in the same process, it might be possible. Similarly, we will have to define how that works for Realm.set/call/etc. But the bottom line is that such a mechanism would eliminate the need for any user-land book-keeping of references that need to be tracked across the boundary, clearing the way for a membrane to support any kind of virtualization.

@domenic
Member Author

domenic commented Jan 21, 2021

I would be interested in seeing the explainer expanded to cover use cases where cross-realm cycles are important. I don't think "making existing membranes work" is a use case. I'd be interested to see things on the same level as the current explainer's "DOM virtualization" or "test frameworks". From what I can tell, no use case mentioned so far requires such cycles.

@leobalter
Member

I'm writing an expansion of the sandbox use case, along with some thoughts on a possible workaround to overcome @caridy's concerns. I'll post it back here when I have a proper review.

@leobalter
Member

We have the TC39 plenary and a TAG Review meeting next week; they are too close for us to reach any good conclusion about potential next steps. I can say we are trying to better understand this new proposal and to identify what might not work and how it would work for us. Our goal is to find a point of agreement.

Meanwhile, I'm still going to expand the use cases in the explainer.

@caridy
Collaborator

caridy commented Jan 22, 2021

@domenic can you expand a little bit on whether or not the realm will be running in the same process / same agent, or if it must run in a separate process? Or is it that you have intentionally left that part out of the proposal to let the UA decide? I'm asking because that might be another differentiating aspect between the two. Can you clarify?

@domenic
Member Author

domenic commented Jan 22, 2021

This proposal has them running in the same process, so that the communication is synchronous. I would of course prefer them to run in separate processes, and for the communication to be asynchronous, per #238, but such a modification does break some of the use cases mentioned in the explainer. This proposal is meant to cover all of the explainer's use cases and as such is synchronous/same process.

@jridgewell
Member

jridgewell commented Jan 23, 2021

Basically, what I'm asking is if the UA can do some ref-tracking across this boundary, so the incubator realm's ref (in this case empty plain object based on your example) can continue to be linked to the corresponding ref from the realm, and release that memory when the realm doesn't need access to that object anymore.

I'm reasonably certain we can reuse the Membrane concept even with IsolatedRealms. Whenever an object crosses the boundary, instead pass a UUID (or anything unique, really). This will require wrapping the IsolatedRealm's methods calling into the realm, and the callParent calling outside the child.

For instance, say we had an object stored in the foo variable inside the realm:

realm.eval(`foo`) === realm.eval(`foo`);

Here, eval would return a Proxy, and running it twice would have to return the same Proxy instance. So, we'll store a Map<uuid, WeakRef<Proxy>>. To get around the lifetime issues, we'll need the child realm to track foo with a FinalizationRegistry, and when foo is finally reclaimed, we can just tell the parent realm to delete the uuid.

Using a Membrane, we'd additionally be able to express a cycle without any issues. If foo.bar === foo, then performing const yFoo = realm.eval('foo'); yFoo.bar === yFoo will be true, too.

@jridgewell
Member

jridgewell commented Jan 23, 2021

To ensure the child's foo isn't reclaimed while the parent's yFoo is still alive, the child realm will need to store a Map<Object, uuid> to strongly hold onto foo. And the parent realm will need another FinalizationRegistry to notify it when yFoo is reclaimed. When that happens, it tells the child to remove its key for foo, and we clean up the memory cycle.
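
The parent-side half of this bookkeeping might look roughly like the following. It is a sketch only: WrapperTable, releaseInChild and makeProxy are hypothetical names, the child-side Map<Object, uuid> is left out, and the sketch makes no attempt to collect cycles that span both sides.

```javascript
// Hypothetical parent-side table: at most one live wrapper per remote id.
class WrapperTable {
  #wrappers = new Map(); // id -> WeakRef<wrapper>
  #registry;

  constructor(releaseInChild) {
    this.#registry = new FinalizationRegistry((id) => {
      const ref = this.#wrappers.get(id);
      // Ignore ids that were already released or re-wrapped in the meantime.
      if (ref !== undefined && ref.deref() === undefined) {
        this.#wrappers.delete(id);
        releaseInChild(id); // the child can now drop its strong reference
      }
    });
  }

  wrapperFor(id, createWrapper) {
    const existing = this.#wrappers.get(id)?.deref();
    if (existing !== undefined) return existing;
    const wrapper = createWrapper(id);
    this.#wrappers.set(id, new WeakRef(wrapper));
    this.#registry.register(wrapper, id);
    return wrapper;
  }
}

const released = [];
const table = new WrapperTable((id) => released.push(id));
const makeProxy = (id) => new Proxy({ id }, {});

const first = table.wrapperFor('foo-id', makeProxy);
const second = table.wrapperFor('foo-id', makeProxy);
console.log(first === second); // true: same id, same Proxy instance
```

Finalization callbacks run at the engine's discretion, so the release notification may arrive late or, at shutdown, not at all.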

@Jamesernator

Jamesernator commented Jan 24, 2021

Using a Membrane, we'd additionally be able to express a cycle without any issues. If foo.bar === foo, then performing const yFoo = realm.eval('foo'); yFoo.bar === yFoo will be true, too.

Unfortunately, cross-system cycles using UUIDs can leak memory, as this issue on the WeakRef proposal demonstrates. Returning an object (or a symbol, if they're allowed as WeakMap keys) would be better, as the engine itself can track cycles properly.

EDIT: Actually, it's unclear whether the proxies returned by eval suffer from the same problem as the linked thread; this would need some investigation.

@erights
Collaborator

erights commented Jan 24, 2021

WeakMaps enable cross-membrane cycles to be collected. WeakRefs do not. That was one of the first motivations for splitting the concepts.

@ByteEater-pl

Why not both? It looks like with the proposal accepted the kind of realms proposed in this issue could be implemented in userland. So maybe just do it and make programmers widely aware of its advantages. If deemed desirable, they could be included in the language as well as the full version, e.g. as a subclass.

@caridy
Collaborator

caridy commented Jan 25, 2021

I had a long conversation with @syg about the features needed for membranes to function. It seems to me and @syg that there are other things that we can do to solve it, and we can consider them orthogonal and complementary to what is being proposed here. That, I believe, is a good thing. We will try to put together some material to explore those other options as a separate proposal. For now, my focus is to try to understand all the details of what is being proposed, and its implications in the context of the current realm proposal.

@caridy
Collaborator

caridy commented Jan 30, 2021

@domenic, I finally got a chance to look at this in detail, and I can say that I'm on board. This, IMO, is a good compromise. I will continue discussing it with other folks; for now, I have a few notes:

  1. eval and handler are good in principle, but I would like to explore other names to avoid confusion. I will put some time into this next week.
  2. in the original proposal, error propagation was supposed to happen at the agent level (root realm with DOM semantics), what will be the proper way for the owner of the realm to deal with errors? including synchronous error (when calling eval), or errors triggered on the next turn? similarly, what can the realm do if the handler throws?
  3. I will work with @syg and @littledan on an idea about sharable identities that can work well with structured cloning. A frozen empty object with a null __proto__ is powerless, so if we can mark such objects accordingly, we might be able to share them between the two realms. I'm thinking of something like Object.createSharableIdentity(), which produces something equivalent to Object.freeze(Object.create(null)) and marks it with an internal slot that the structured cloning algo can use to simply share those objects between realms just like any other primitive value, allowing live references to be kept between realms in a weakmap. Something like that might be enough to solve all virtualization cases that we have discussed. Clearly, this can be a separate proposal, which is orthogonal and complementary to this.
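
As a rough illustration of the sharable-identity idea in note 3, each side of the boundary could keep its own WeakMap keyed by the shared token. Everything here is an assumption: createSharableIdentity is a userland approximation with no internal slot, IdentityTable is a hypothetical name, and both tables live in one realm purely for demonstration.

```javascript
// Userland approximation of the proposed Object.createSharableIdentity():
// a frozen, prototype-less object that carries identity and nothing else.
function createSharableIdentity() {
  return Object.freeze(Object.create(null));
}

// One table per side of the boundary; only tokens would ever be exchanged.
class IdentityTable {
  #tokenToObject = new WeakMap();
  #objectToToken = new WeakMap();

  // Return the token for a local object, minting one on first use.
  tokenFor(object) {
    let token = this.#objectToToken.get(object);
    if (token === undefined) {
      token = createSharableIdentity();
      this.bind(token, object);
    }
    return token;
  }

  // Associate a token received from the other side with a local object.
  bind(token, object) {
    this.#tokenToObject.set(token, object);
    this.#objectToToken.set(object, token);
  }

  objectFor(token) {
    return this.#tokenToObject.get(token);
  }
}

const childTable = new IdentityTable();
const parentTable = new IdentityTable();

const foo = { x: 1 }; // lives on the "child" side
const token = childTable.tokenFor(foo); // the only thing that crosses
const wrapper = { kind: 'wrapper for foo' }; // lives on the "parent" side
parentTable.bind(token, wrapper);

console.log(childTable.tokenFor(foo) === token); // true: stable identity
console.log(parentTable.objectFor(token) === wrapper); // true
console.log(childTable.objectFor(token) === foo); // true
```

Because both maps are weak, an entry can be collected once neither the token nor the local object is reachable from anywhere else.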

@leobalter
Member

For the record, we had a sync about this sharableIdentity API at the SES meeting on Wednesday, and it got generally positive feedback.

I'm looking forward to @domenic's feedback and hopefully we can set a working path forward.

@erights
Collaborator

erights commented Feb 5, 2021

Hi @leobalter, the only email address I have for you no longer works. Assuming you have one for me, please send me yours as well as the recording of the last SES session. Thanks.

@littledan
Member

Could Symbols as WeakMap keys provide this shareable identity concept?

@Jamesernator

Could Symbols as WeakMap keys provide this shareable identity concept?

Assuming they can pass through and be returned by realm.eval() they should be of identical power as one could always simulate Object.createSharableIdentity() with them or vice versa.

The main advantage I see for Object.createSharableIdentity() is that it would be a little easier to discriminate them from other symbols that pass through .eval().

@domenic
Member Author

domenic commented Feb 8, 2021

I'm looking forward to @domenic's feedback and hopefully we can set a working path forward.

I'm glad to hear that this proposal is being taken seriously, and could work for at least some of the champions! As I said, it could work for me and the constituencies I represent. (Major issues remain, such as #261, but it is at least a large step in the right direction.)

I'll caution that @syg and I are still working to understand whether Chrome security is comfortable shipping sync/in-process realms in any form. (It turns out, they do not agree with my statement "This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.") This may be ameliorated with a name change, e.g. InsecureRealm or SideChannelAttackableRealm, but it's too soon to make any concrete statements. We're continuing to push for resolution internally.

Regarding shared identity things of the sort you discuss, I think exploring such things makes sense as a follow-on. @syg would be the best point of contact there, especially given his work on disjoint object graphs for cross-realm concurrency, which seems pretty related.

in the original proposal, error propagation was supposed to happen at the agent level (root realm with DOM semantics), what will be the proper way for the owner of the realm to deal with errors? including synchronous error (when calling eval), or errors triggered on the next turn? similarly, what can the realm do if the handler throws?

I don't have any concrete ideas or opinions here, but I'll point out that Error objects and subclasses are structured-cloneable, so I think most any semantics could work, as long as the error values get structured cloned on the way in or out of the realms.
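
For instance, one possible semantics is to clone whatever a call returns or throws before it reaches the other side. The sketch below uses the standard structuredClone() global within a single realm; runAcrossBoundary is a hypothetical helper, not part of any proposal.

```javascript
// Hypothetical helper: run `fn`, then clone its result or its thrown value,
// as a boundary crossing would.
function runAcrossBoundary(fn) {
  let value;
  try {
    value = fn();
  } catch (error) {
    return { ok: false, error: structuredClone(error) };
  }
  return { ok: true, value: structuredClone(value) };
}

const thrown = new TypeError('boom');
const outcome = runAcrossBoundary(() => {
  throw thrown;
});

console.log(outcome.ok); // false
console.log(outcome.error instanceof TypeError); // true: the built-in error type survives
console.log(outcome.error.message === 'boom'); // true
console.log(outcome.error === thrown); // false: the caller gets a copy
```

Only the built-in error types round-trip this way; what should happen to custom Error subclasses is one of the details a concrete design would need to settle.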

@caridy
Collaborator

caridy commented Feb 8, 2021

A few more notes:

  1. from the champions, and SES, we are now all aligned on the hard boundary between realms, and we will actively look into the other details.
  2. structured cloning seems to be a point of contention for the group, and we are actively looking into alternatives as well as trying to articulate why it is a problem. One idea is to just allow primitive values to be passed across the boundary, assuming that eventually we might get records and tuples to be able to pass complex structures across the boundary.
  3. Symbols as weakmap keys can solve the membrane use-cases (and will not require defining something new like Object.createSharableIdentity), and I plan to work with @littledan on this topic considering that he is the champion on the existing proposal to support this.

@leobalter
Member

leobalter commented Feb 8, 2021

@annevk how does this alternative proposal look to you? I believe the existing Realms proposal has some ongoing issues that might be mitigated by @domenic's proposal.

If it goes through a positive path - including for @caridy's additional work - I'd quickly start updating the explainer and spec text here and preparing a new one for createSharableIdentity.

@leobalter
Member

I see that investing in the symbols as weakmap keys would be a better solution and investment for this. I'll get this sync'ed with @littledan and @caridy.

@annevk
Member

annevk commented Feb 9, 2021

It removes all the cross-realm concerns I had. I think saying it's comparable to Trusted Types is overselling it since that would ensure that the code that is run is actually vetted, which is not the case here. It's not a security mechanism, it's a way to run code encapsulated from global state, and if you don't trust the code you're still putting yourself at risk.

@erights
Collaborator

erights commented Feb 9, 2021

It's not a security mechanism

I see this slogan causing a lot of confusion. We need better distinctions. See
https://agoric.com/taxonomy-of-security-issues/

Realms are an integrity mechanism. They are certainly not a confidentiality or availability mechanism.

if you don't trust the code you're still putting yourself at risk

I don't trust the code I wrote yesterday. We manage and mitigate risks. We need to do better at that. We never eliminate risks.

@Jamesernator

Jamesernator commented Feb 10, 2021

It removes all the cross-realm concerns I had. I think saying it's comparable to Trusted Types is overselling it since that would ensure that the code that is run is actually vetted, which is not the case here. It's not a security mechanism, it's a way to run code encapsulated from global state, and if you don't trust the code you're still putting yourself at risk.

Realms are an integrity mechanism. They are certainly not a confidentiality or availability mechanism.

I would point out that Realms, by themselves, are not a confidentiality mechanism. But they are still a practically† necessary part of any JS confidentiality mechanism. I believe, but @erights would need to confirm, that the SES proposal is what is necessary for confidentiality.

Availability is covered by neither this Realms proposal nor the SES proposal, but rather would need to be covered by an Agent proposal (which has often been mentioned in passing in the various realms/compartments/ses proposals, but nothing concrete has been proposed thus far). Most hosts already give a way to create agents (e.g. Worker) so this is presumably lower priority, as one can just use SES realms inside of a host-provided agent.

† Strictly speaking SES could be implemented purely with the root realm and compartments, but lockdown permanently changes the realm in ways incompatible with a lot of existing code (for example frozen intrinsics, removal of all stateful/non-deterministic APIs such as Date.now() or Math.random(), etc).

@ljharb
Member

ljharb commented Feb 23, 2021

@leobalter max/min is a philosophy that I think many of us consider to have been a mistake in ES6. It doesn't make sense to me to add 2 out of the 4 Function globals; that's just creating inconsistency.

Like @littledan, I also think it's critical we have mechanisms that are not just evalling strings.

@leobalter
Member

I believe we should avoid unnecessary complexity.

@ljharb

The only reason I see for adding [Async]Generator functions is that they are other function formats. We are not discussing what is available in the new realms, but the channels we need to operate with this Realm. Is there any use case that needs iterators through these channels?


@littledan

As I mentioned somewhere in the past, I'm in full support for module blocks and I still believe this example should not use them.

We already carry long-standing baggage of challenges for this proposal, and I'd rather go with a solution that is compelling enough without depending on another proposal that has its own challenges ahead.

As @caridy has mentioned, connect forces an async mechanism, which is a complicated trade-off over CSP handling.

b) imposes a protocol that relies on the export names defined in the module, which is new.
c) it requires module blocks to be available to do anything useful with it.

I believe these concerns can be mitigated if we use the specifier instead of a module block in the example.

I simplified the names and arrow functions to support my own reading of the example.

// connector.js
export default function(fn) {
  return v => fn(v + 1);
};

// main.js
let n, send;
const r = new Realm;
await r.connect(
  function(sender) {
    send = sender; // The original example had the other way around, sender = send
    return val => { n = val; };
  },
  './connector.js'
);

send(4);
console.log(n); // 5

This example looks better, and the names should be distinct enough to avoid confusion. It still seems to be only one function per connector, along with one new async tick.

Can you help me with the steps here, such as where the sender function is created and where the returned val => { n = val; } goes? Likewise, what are the steps when I call send(4)?

I'm assuming we have internals connecting these functions but I'm getting lost every time I try to create a step by step.

@caridy
Collaborator

caridy commented Feb 23, 2021

GeneratorFunction and AsyncGeneratorFunction

@ljharb I think you're right... if those can be created from syntax, and can work as bridge functions, then I don't see why we wouldn't expose them to have a complete API on the Realm.

@leobalter
Member

FWIW, I'm happy to add these extra bridge functions if the concern is a dealbreaker.

@littledan
Member

Yeah, I can see how it is a problem that creating a realm and a communication channel is an async operation in my suggestion. At the moment, I can't think of a solution which both resolves the no-eval issue and is synchronous. I will keep thinking on this issue.

@Jamesernator

Jamesernator commented Mar 1, 2021

Yeah, I can see how it is a problem that creating a realm and a communication channel is an async operation in my suggestion. At the moment, I can't think of a solution which both resolves the no-eval issue and is synchronous. I will keep thinking on this issue.

Perhaps we could have a "script block" that creates unevaluated scripts similar to module blocks e.g.:

const someScript = script {
  console.log("Hello");
}

Also thinking about your .connect idea, it could be simplified to simply returning the realm-wrapped function (or a Promise for it in the case of modules), rather than doing the awkward callback thing:

// Synchronous for scripts, this would be like .eval, but wraps the returned
// function for the current realm, so addOne !== the lambda inside the script
const addOne = realm.connectScript(script {
  (n) => n + 1;
});

// Asynchronous for modules, this would act like .import, but the default export
// is treated as a function to wrap, the addTwo in this realm is not the same as
// the addTwo within the realm
const addTwo = await realm.connectModule(module {
  export default function addTwo(n) {
    return n + 2;
  }
});

@leobalter
Member

leobalter commented Mar 25, 2021

Isolated Realm API changes

This comment describes a possible solution for the API of the Realm to work with the new isolation model described in this issue.

API (in typescript notation)

declare class Realm {
    constructor();
    eval(sourceText: string): PrimitiveValueOrCallable;
    importBinding(specifier: string, bindingName: string): Promise<PrimitiveValueOrCallable>;
}

Skipping relaxed CSP

In this API review, the import method becomes importBinding (name open for bikeshedding). It allows injecting code into the constructed realm while getting a binding value. The value is restricted to primitives, and a function is automatically wrapped as a connecting function. There is no need to provide any argument to the Realm constructor, and importBinding has a meaningful promise resolution.

const realm = new Realm();

await realm.importBinding('console-shim', 'default');

const redTrySample = await realm.importBinding('sampler', 'trySample')

// redTrySample can still receive primitives such as symbols, etc

const result = redTrySample(2, 3);

The wrapped functions can receive functions as arguments. This allows the constructed realm to trigger a callback in the incubator realm, without knowing about the incubator realm.

const realm = new Realm();

const redRunTests = await realm.importBinding('testFramework', 'runTests');

function reportResults(...args) {
    /* ... manages results from args ... */
}

reportResults.noop = 1;

// The constructed realm receives a new function that would chain to
// reportResults when called with its given arguments.
redRunTests(reportResults);

// The connecting function created inside the realm won't have any access to
// the property 'noop'. It does not receive a structured clone of that function.

The (other) basics

This API explores a modification of the Isolated Realms proposal while trying to preserve some of its goals. It has a similar level of expressivity and isolation. It still disallows direct access between the parent and child realms, but it does not use structured cloning. This is possible through auto wrapping connected functions.

In this API, any action hitting a disallowed completion would throw an exception. The only values that can be transferred are primitives (string, number, boolean, symbol, undefined, null, BigInt, etc.). There is a special behavior to wrap callable objects, generally functions. By callable objects, we mean any object with a [[Call]] capability. When the API evaluates a callable object completion, it should create an internal reference to it and a new function in the other realm that would receive this completion. When that new function is called, it chains the call to the reference in the different realm, transferring the given arguments. These arguments share the same restriction to only allow primitives and callable objects.

const realm = new Realm();

try {
    realm.eval("[1, { foo: 'bar' }]");
} catch {
    // Throws a TypeError
}

// If you try to get access to a constructed realm's constructor,
// you get a wrapped function, which isn't very useful as it can't return
// object values:

const redArray = realm.eval("Array");

// redArray is only another function that would eventually chain the call to the
// Array constructor in the other realm.

try {
    redArray();
} catch (err) {
    assert(err.constructor === TypeError);
}

// The wrapped functions are always frozen and do not share properties!

assert(Object.isFrozen(redArray));
assert(redArray.__proto__ === Function.prototype); // not from the other realm's prototype

The wrapped functions allow setting values in the other realm including Symbols while providing more API flexibility.

const realm = new Realm();
const mySymbol = Symbol();
const fn = realm.eval(`(function(x) { globalThis.foo = x; })`);

fn(mySymbol); // analogous to the previous realm.set("foo", "bar"), here with a symbol

const result = realm.eval('globalThis.foo');

assert(result === mySymbol);

The wrapped functions also act as sugar for the previous realm.call, while avoiding fingerprints in the inner realm's globals:

const realm = new Realm();
const add = realm.eval(`(x, y) => x + y`);

const result = add(2, 3); // equivalent to the previous realm.call("add", 2, 3);

assert(result === 5);

Avoiding these fingerprints also removes the need for such things as globalThis.callParent() in the constructed realms.


There are more details in this README file.

@Jack-Works
Member

From the developer's perspective, I think the best way is to make the membrane the default and allow opting out.

  1. Membrane by default:
(await new Realm().import("./val")).array instanceof Array
// true

Membranes by default can avoid mysterious behaviors for simple use cases.

  2. Opt-out if they actually don't need a membrane:

(await new Realm({ membrane: false }).import("./val")).array instanceof Array
// false

@caridy
Collaborator

caridy commented Mar 25, 2021

@Jack-Works let's keep the membrane separate, that's not a direct goal of the Realms proposal. At some point we might propose some membranes specific proposal that can complement this proposal.

@caridy
Collaborator

caridy commented Mar 25, 2021

@leobalter thanks for the write-up. That's a lot of info to digest, so let me try to provide some high-level thoughts that might help explain what we have been discussing for a few weeks:

  1. The primary goal is to make the realms isolated (as described in the description of this issue).
  2. Ergonomics is another important goal, that translates to:
    a) how easy it is to transition code that is running in iframes, in Node's VM module, inside the 8 magic lines of code, etc. to rely on a more robust solution using Realm.
    b) how easy it is to understand and use this API (considering that this is a fairly low level API).
  3. How to fulfill the use-cases that the Realm proposal was set to resolve.

From that, we are of the opinion that the coordination between the two realms can be done via Primitive Values plus Callable Values, and this will allow users to implement their own protocol, hence the proposed API.

The implementation seems to be simple enough; the following is a very early draft of the spec for that:

GetWrappedValue ( ReceiverRealm, value )

The GetWrappedValue abstract operation takes arguments ReceiverRealm and value, it performs the following steps:

1. If IsPrimitiveValue(value) is true, return value.
1. If IsCallable(value) is false, throw a TypeError exception.
1. Return ? CreateWrappedCallable(ReceiverRealm, value).

CreateWrappedCallable ( ReceiverRealm, value )

The CreateWrappedCallable abstract operation takes arguments ReceiverRealm and value, it performs the following steps:

1. Assert: IsCallable(value) is true.
1. Let F be a new built-in function object associated to ReceiverRealm as defined in Wrapped Functions.
1. Set F.[[OriginalCallable]] to value.
1. Return F.

Wrapped Functions

A wrapped function is an anonymous built-in function that has an [[OriginalCallable]] internal slot.

When a wrapped function F is called with arguments, the following steps are taken:

1. Let foreignCallable be F.[[OriginalCallable]].
1. Assert: IsCallable(foreignCallable) is true.
1. Let foreignRealm be F.[[Realm]].
1. Let currentRealm be the current Realm Record.
1. Let argList be ? CreateListFromArrayLike(arguments).
1. For each element key of argList, do
   1. Let o be argList[key].
   1. Set argList[key] to ? GetWrappedValue(foreignRealm, o).
1. Let value be ? Call(foreignCallable, undefined, argList).
1. Return ? GetWrappedValue(currentRealm, value).
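
The steps above can be approximated in userland to get a feel for the behavior. This is only a rough sketch under stated assumptions, not the proposal's implementation: it runs in a single realm, so the ReceiverRealm argument is dropped, a closure stands in for the [[OriginalCallable]] internal slot, and the helper names getWrappedValue, createWrappedCallable and isPrimitive are illustrative.

```javascript
// Userland approximation of the draft steps above. Real realms are not
// modeled, so the ReceiverRealm argument is omitted (an assumption of this
// sketch); a closure stands in for the [[OriginalCallable]] internal slot.

function isPrimitive(value) {
  return value === null ||
    (typeof value !== 'object' && typeof value !== 'function');
}

// Mirrors GetWrappedValue: primitives pass through, callables are wrapped,
// everything else is rejected.
function getWrappedValue(value) {
  if (isPrimitive(value)) return value;
  if (typeof value !== 'function') {
    throw new TypeError('only primitives and callables may cross the boundary');
  }
  return createWrappedCallable(value);
}

// Mirrors CreateWrappedCallable plus the call steps of a wrapped function.
function createWrappedCallable(originalCallable) {
  const wrapped = (...args) => {
    // Each argument is checked or wrapped before it reaches the other side.
    const argList = args.map((arg) => getWrappedValue(arg));
    // The draft calls the original with an undefined this value.
    const value = Reflect.apply(originalCallable, undefined, argList);
    // The completion value is checked or wrapped on the way back.
    return getWrappedValue(value);
  };
  // Frozen, as the thread says wrapped functions would be.
  return Object.freeze(wrapped);
}

const add = getWrappedValue((x, y) => x + y);
const makeArray = getWrappedValue(() => [1, 2, 3]);

console.log(add(2, 3)); // 5

try {
  makeArray(); // the array completion cannot cross the boundary
} catch (err) {
  console.log(err instanceof TypeError); // true
}
```

Because every call re-runs the same check on arguments and on the completion value, a callback passed into a wrapped function is itself wrapped, which is how the two sides can call each other without ever holding each other's objects.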

With that in mind, both Realm.prototype.eval and Realm.prototype.importBinding will rely on GetWrappedValue before returning or resolving the promise to the incubator realm. The rest just works from there, since each realm will only be able to see wrapped functions from the other side.

It is important to note that wrapped functions do have identity (associated with the realm they belong to), but there is no way to trace them back to the original callable.

We also see some intersection semantics with Records and Tuples in the sense that those will allow complex structures (without identity) to be shared between the two sides (under the same process).
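
Until something like Records and Tuples is available, structured data can still cross a primitives-plus-callables boundary as a string. The following is a minimal sketch of one such user-level protocol, not part of the proposal: primitivesOnly, redSummarize and summarize are hypothetical names, and primitivesOnly is only a stand-in for a wrapped function (for brevity it rejects callables too, which the proposed API would wrap instead).

```javascript
// Hypothetical stand-in for a wrapped function obtained from realm.eval or
// realm.importBinding: anything that is not a primitive is rejected.
function primitivesOnly(fn) {
  const check = (v) => {
    if (v !== null && (typeof v === 'object' || typeof v === 'function')) {
      throw new TypeError('non-primitive value at the boundary');
    }
    return v;
  };
  return (...args) => check(fn(...args.map((arg) => check(arg))));
}

// "Child realm" side: accepts and returns JSON text instead of objects.
const redSummarize = primitivesOnly((json) => {
  const scores = JSON.parse(json);
  const total = scores.reduce((sum, n) => sum + n, 0);
  return JSON.stringify({ count: scores.length, total });
});

// "Parent realm" side: encodes the request and decodes the reply, so the
// resulting object is built from the parent's own intrinsics.
function summarize(scores) {
  return JSON.parse(redSummarize(JSON.stringify(scores)));
}

const summary = summarize([3, 4, 5]);
console.log(summary.count, summary.total); // 3 12
```

Only strings cross the boundary here, so neither side ever sees an object created by the other.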

We hope to open the conversation about this option, as it seems to solve the ergonomic issues related to previous proposals from @domenic, @littledan and myself.

/cc @erights @syg

@rwaldron
Collaborator

@caridy I believe that what you've described here would be sufficient as a drop-in replacement for the various realm creation mechanisms that eshost uses in each JS engine host. Currently, eshost creates realms with whatever the host provides, as you mentioned, vm in node, iframe in browser, and other engine host specific mechanisms (they're different in each of the major implementations).

@ByteEater-pl

eval(sourceText: string): PrimitiveValueOrCallable

But what about the desirable feature of enabling synchronous execution in a realm without passing code in a string?

@caridy
Collaborator

caridy commented Mar 25, 2021

eval(sourceText: string): PrimitiveValueOrCallable

But what about the desirable feature of enabling synchronous execution in a realm without passing code in a string?

@ByteEater-pl you will have to go the module way, which is not sync. That's what Realm.prototype.importBinding is for. If you combine that with import blocks, it is going to be very compelling, but async.

@ljharb
Member

ljharb commented Mar 25, 2021

Why is there no non-eval synchronous mechanism? (especially considering one could be created trivially, as long as one is willing/able to wait a tick to set it up)

@leobalter
Member

There is already a suggestion to create a method to inject code, like executeScript(specifier). I believe that might remain simple on the ECMAScript side, but it needs more investigation into the host integrations, including HTML. I don't think that's trivial.

The importBinding method is good enough as a starting point that unblocks most of the use cases, without requiring a new loader mechanism.

I'm good with having a script injection method, but I'd do it as a follow-up. Unblocking this proposal will also help clarify the next steps.

@caridy
Collaborator

caridy commented Mar 25, 2021

Why is there no non-eval synchronous mechanism? (especially considering one could be created trivially, as long as one is willing/able to wait a tick to set it up)

I'm of the opinion that we can try to add that mechanism to the language in general, not just the Realm proposal.

@ByteEater-pl

ByteEater-pl commented Mar 26, 2021

I'm of the opinion that we can try to add that mechanism to the language in general

What do you mean?

@leobalter
Member

I'm of the opinion that we can try to add that mechanism to the language in general

What do you mean?

@ByteEater-pl This mechanism does not yet exist in ECMAScript, except maybe through other extension APIs. In my opinion, if TC39 wants to explore code injection that is different from modules, we should have a larger discussion that is outside the bounds of the Realms proposal. There is more to discuss about how we want to inject code and what it means for the language and host integration. This should not block this proposal, as the main goals are still met in its current form.

@ByteEater-pl

But at least as a rough sketch, what do you mean by code injection in a broader context? Injection into what? Something other than Realms? They are the abstraction present in the spec, and both new language features and host integrations are in each case defined in terms of them. Do you believe there are scenarios for which some other abstraction would need to be added, bypassing Realms, and can such a vision at present be made tangible enough to warrant choosing that alternative direction, as opposed to first trying to support those scenarios by adding features to Realms (possibly even with this very proposal)?

Or maybe I'm missing your point, entirely or partially. If so, I'm sorry as a non-native English speaker for being possibly more demanding of your patience than I realized.

@caridy
Collaborator

caridy commented Jul 21, 2021

@leobalter I believe this can be closed now since it is now codified as part of the callable boundary effort. I also want to thank @domenic for suggesting these changes, it turned out great IMO.
