MSC3819: Allowing widgets to send/receive to-device messages #3819

247 changes: 247 additions & 0 deletions proposals/3819-to-device-messages-for-widgets.md
@@ -0,0 +1,247 @@
# MSC3819: Allowing widgets to send/receive to-device messages

Widgets (embedded HTML applications in Matrix) currently have a relatively large surface area
they can use for interacting with their attached client, primarily in the context of a room. They
can [send/receive events with MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762),
[navigate to rooms with MSC2931](https://github.com/matrix-org/matrix-spec-proposals/pull/2931),
and even [open dialogs with MSC2790](https://github.com/matrix-org/matrix-spec-proposals/pull/2790),
but they can't act as a whole other Matrix client just yet.

This MSC forms part of a larger, ongoing question about how to embed Matrix clients into another
client or room. An increasingly popular approach is to build an entirely new Matrix client and
embed it within another client (as a widget) so that the user does not need to switch apps. To
support this, we need to consider both the long-term and short-term impact of the changes we
propose. This MSC aims closer to the short term.

A longer term solution to the problem of clients wanting to be embedded in other clients might still
be widgets, though with a system like [MSC3008](https://github.com/matrix-org/matrix-spec-proposals/pull/3008)
to restrict access to the client-server API more effectively. For this MSC's purpose though, we're
aiming to cover a specific subset of the client-server API: to-device messages.

While we could expose the entire client-server API over `postMessage` (or similar) for embedded
clients to access, the permissions model gets hairy and difficult to secure on the client side. Instead,
we're exploring what it would look like to special case what is needed for specific applications, as
needed, starting with to-device messages.

To-device messaging is described in the
[client-server API specification](https://spec.matrix.org/v1.2/client-server-api/#send-to-device-messaging).
For now, the practical application for widget-ized clients is implementing
[MSC3401 - Native group VoIP](https://github.com/matrix-org/matrix-spec-proposals/pull/3401).

## Prerequisite background

*Author's note: This is copied from [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).*

Widgets are relatively new to Matrix and so the terminology and behaviour might not be known to all
readers. This section should clarify the components of widgets that are applicable to this MSC without
going on a deep dive into widgets in general.

Widgets are embedded HTML/JS/CSS applications in a client which use the `postMessage` API to talk
to the client. This communication allows widgets to provide enhanced functionality such as sticker
pickers (when applied to a user) or performance dashboards (in rooms).

One of the first things that happens over this communication channel is a "capabilities negotiation"
where the client asks the widget what permissions it wants, and the widget replies with its ideal
set. The client then either decides or asks the user if the permissions requested are okay.

All communication over the channel is done in a simple request/response flow, using actions to
describe the request. For the capabilities negotiation, this would be the client sending the widget
a request with an `action` of `capabilities`, and the widget would respond to that request with a
response object.

The channel in which communication occurs is called a "session", where the session is "established"
after the capabilities negotiation. Sessions can only be terminated by the client.

The Widget API is split into two parts: `toWidget` (client->widget) and `fromWidget` (widget->client).
They are differentiated by where the request originates.

## Proposal

Heavily inspired by [MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762), we
introduce new capabilities for to-device messages:

* `m.send.to_device:<event type>` (eg: `m.send.to_device:m.call.invite`) - Used for sending to-device
messages of a given type.
* `m.receive.to_device:<event type>` (eg: `m.receive.to_device:m.call.invite`) - Used for receiving
to-device messages of a given type.
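
As a rough illustration of the capability format above, the sketch below builds a capability
string and checks it against the set a client approved. The helper names are hypothetical and not
part of this MSC or of any library; only the `m.<direction>.to_device:<event type>` format comes
from the proposal.

```typescript
// Sketch only: helper names are hypothetical. The capability string format
// (`m.send.to_device:<event type>`) is the part taken from the proposal.
type ToDeviceDirection = "send" | "receive";

// Build the capability identifier for one direction and event type.
function toDeviceCapability(direction: ToDeviceDirection, eventType: string): string {
  return `m.${direction}.to_device:${eventType}`;
}

// Return true only if that exact capability was approved during negotiation.
function isToDeviceAllowed(
  approved: readonly string[],
  direction: ToDeviceDirection,
  eventType: string,
): boolean {
  return approved.indexOf(toDeviceCapability(direction, eventType)) !== -1;
}

// Usage: a widget approved to send, but not receive, `m.call.invite`.
const approvedCapabilities = ["m.send.to_device:m.call.invite"];
console.log(isToDeviceAllowed(approvedCapabilities, "send", "m.call.invite")); // true
console.log(isToDeviceAllowed(approvedCapabilities, "receive", "m.call.invite")); // false
```

An exact string match keeps the check per event type, so approving one type does not implicitly
approve another.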

These capabilities open up access to the following respective actions, when approved:

**`fromWidget` action of `send_to_device`**

```json5
{
  // This is a standardized widget API request.
  "api": "fromWidget",
  "widgetId": "20200827_WidgetExample",
  "requestid": "generated-id-1234",
  "action": "send_to_device", // value defined by this proposal
  "data": {
    // Same structure as the `/sendToDevice` HTTP API request body
    "@target:example.org": {
      "DEVICEID": { // can also be a `*` to denote "all of the user's devices"
        "example_content": "put your real message here"
      }
    }
  }
}
```

Review comments on lines +78 to +85 of the diff (the `data` object above):

@robintown (Member), Jun 22, 2022:

We need to include the event type somewhere in here. I'd suggest an interface of

    {
      type: string;
      messages: { [userId: string]: { [deviceId: string]: unknown } };
    }

@robintown (Member), Jul 15, 2022:

We also need to control whether the event is encrypted or not

    {
      type: string;
      encrypted: boolean;
      messages: { [userId: string]: { [deviceId: string]: unknown } };
    }

A further comment in the same thread:

I noticed an issue with the current implementation. The payload (`unknown` above) varies between
`encrypted: true` and `encrypted: false` in Element Web right now. For unencrypted messages it is
directly the content of the message, like `{"key": "value"}`, but for encrypted events it has to be
wrapped like `{"type": "event type…", "content": {"key": "value"} }`. If that is really desired and
not just a mistake, we should make sure to document the behavior.

I created an issue for it: element-hq/element-web#24470

Upon receipt, the client validates that the widget has an appropriate capability to send the
to-device message. If the widget is approved for such a capability, the client **MUST** encrypt
the message by default, unless the event is already encrypted by the widget. (This MSC doesn't
provide enough API surface for a widget to do this, but in future a widget might be able to gain
some context on the client's encryption state and use that to make/manage Olm sessions.)
The encrypted message is then sent as requested to the users/devices using
[`/sendToDevice`](https://spec.matrix.org/v1.2/client-server-api/#put_matrixclientv3sendtodeviceeventtypetxnid).

If the widget doesn't have appropriate permission, or an error occurs anywhere along the send path,
a standardized widget error response is returned.

Under the widget API, a response to all actions is required and takes the shape of repeating the
request with an added top-level `response` field. This `response` field is empty for this action,
as shown:

```json5
{
  "api": "fromWidget",
  "widgetId": "20200827_WidgetExample",
  "requestid": "generated-id-1234",
  "action": "send_to_device",
  "data": {
    "@target:example.org": {
      "DEVICEID": {
        "example_content": "put your real message here"
      }
    }
  },
  "response": {}
}
```
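
The "repeat the request and add a `response` field" shape can be sketched as a small helper. This
is a minimal illustration, not the API of any existing library: the `WidgetRequest` type and the
`acknowledge` function are names assumed for this example, with the field names taken from the
JSON above.

```typescript
// Sketch only: type and function names are hypothetical; field names follow
// the request/response examples in this proposal.
interface WidgetRequest<T> {
  api: "fromWidget" | "toWidget";
  widgetId: string;
  requestid: string;
  action: string;
  data: T;
}

type WidgetResponse<T, R> = WidgetRequest<T> & { response: R };

// Echo the request back unchanged and attach an empty `response` object,
// which is how this action is acknowledged.
function acknowledge<T>(request: WidgetRequest<T>): WidgetResponse<T, Record<string, never>> {
  return { ...request, response: {} };
}

// Usage
const exampleRequest: WidgetRequest<Record<string, unknown>> = {
  api: "fromWidget",
  widgetId: "20200827_WidgetExample",
  requestid: "generated-id-1234",
  action: "send_to_device",
  data: { "@target:example.org": { DEVICEID: { example_content: "put your real message here" } } },
};
console.log(JSON.stringify(acknowledge(exampleRequest).response)); // {}
```

Because the response echoes `requestid`, the sender can match it to the pending request even when
several requests are in flight.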

The client *should not* send a response to the action until the server has returned 200 OK itself,
which might take longer than the default widget API timeout of 10 seconds. Widgets should raise their
maximum timeout to 60 seconds or more for this action.

**`toWidget` action of `send_to_device`**

*Note*: It is common practice to reuse an action name for both directions of travel rather than
invent an alternative name. This does mean that there are two `send_to_device` actions: one
for widget->client and one for client->widget. This section is talking about client->widget.

After the client has decrypted all to-device messages it receives, it determines if any widgets should
be made aware of the contents within. The decrypted event type for the message is used to determine
if the widget has appropriate capability to see the message.

The client should process all to-device messages it can before sending them off to the widget. Even if
the client does process a message though, it should still send it to the appropriate widgets for
potential re-processing. This is to avoid a scenario where the host client can no longer reliably
function, such as if Olm sessions get corrupted or similar.

The client should be aware that to-device messages might be seen which the client *could* handle, but
might not have context on, such as VoIP signaling. The client should not error out if it can't locate
a matching call, for example.

The client SHOULD only send events which were received by the client *after* the session has been
established with the widget (after the widget's capabilities are negotiated).
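
The forwarding rules above (capability check on the decrypted type, and only events received after
the session was established) could be sketched as follows. All names here are hypothetical, and the
`receivedAt`/`establishedAt` timestamps are an assumption about how a client might track ordering;
the proposal does not prescribe a mechanism.

```typescript
// Sketch only: all type, field, and function names are hypothetical.
interface DecryptedToDeviceEvent {
  type: string; // the decrypted event type
  sender: string;
  content: Record<string, unknown>;
  receivedAt: number; // assumed: when the client received the event (ms)
}

interface WidgetSession {
  widgetId: string;
  approvedCapabilities: readonly string[];
  establishedAt: number | null; // assumed: null until capabilities are negotiated
}

// Decide whether one decrypted to-device event may be forwarded to a widget.
function shouldForward(event: DecryptedToDeviceEvent, session: WidgetSession): boolean {
  // No established session yet: nothing is forwarded.
  if (session.establishedAt === null) return false;
  // Skip events the client received before the session was established.
  if (event.receivedAt < session.establishedAt) return false;
  // The decrypted type must match an approved receive capability.
  return session.approvedCapabilities.indexOf(`m.receive.to_device:${event.type}`) !== -1;
}

// Usage
const session: WidgetSession = {
  widgetId: "20200827_WidgetExample",
  approvedCapabilities: ["m.receive.to_device:m.call.invite"],
  establishedAt: 1000,
};
const invite: DecryptedToDeviceEvent = {
  type: "m.call.invite",
  sender: "@source:example.org",
  content: {},
  receivedAt: 2000,
};
console.log(shouldForward(invite, session)); // true
```

The check runs per widget, so the same event can be forwarded to one widget and withheld from
another.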

The request itself looks as follows:

```json5
{
  // This is a standardized widget API request.
  "api": "toWidget", // note that we're sending *to* the widget here
  "widgetId": "20200827_WidgetExample",
  "requestid": "generated-id-1234",
  "action": "send_to_device", // value defined by this proposal
  "data": {
    "type": "m.call.invite",
    "sender": "@source:example.org",
    "content": {
      // ... as required for the event schema
    }
  }
}
```

Review comment (Member) on lines +156 to +161 of the diff (the `data` object above):

This needs a field to indicate whether the content was encrypted, so widgets can determine whether
they trust the message's integrity.

Suggested change, from:

    "data": {
      "type": "m.call.invite",
      "sender": "@source:example.org",
      "content": {
        // ... as required for the event schema
      }

to:

    "data": {
      "type": "m.call.invite",
      "sender": "@source:example.org",
      "encrypted": true,
      "content": {
        // ... as required for the event schema
      }

Note that the action only supports a single to-device message at a time. This is for symmetry with
[MSC2762](https://github.com/matrix-org/matrix-spec-proposals/pull/2762).

Under the widget API, a response is required from the widget. The widget simply acknowledges the request
with an empty response object:

```json5
{
  "api": "toWidget",
  "widgetId": "20200827_WidgetExample",
  "requestid": "generated-id-1234",
  "action": "send_to_device",
  "data": {
    "type": "m.call.invite",
    "sender": "@source:example.org",
    "content": {
      // ... as required for the event schema
    }
  },
  "response": {}
}
```

## Potential issues

Due to a lack of documentation/spec, conventions for the widget API and its security principles
could be misunderstood or confusing. This MSC deliberately over-describes the cases that are at
risk of being misunderstood; however, readers are still encouraged to gather as much information
as they can before reviewing this proposal.

This MSC further pushes forward the idea that the `postMessage` transport for the widget API is the
way to go, although MSCs like [MSC3009](https://github.com/matrix-org/matrix-spec-proposals/pull/3009)
explore what a different transport mechanism could look like. This MSC is not tied directly
to `postMessage`; it instead describes the requests/responses used over the widget API, whatever
transport that might be.

Review comments on this section:

While working with the current support of MSC3819 in Element we noticed a problem:

The send endpoint of the to-device message API allows targeting specific device IDs, but there is
currently no way for a widget to learn a specific device ID. It is not possible to embed a device
ID into the URL, nor does a received to-device message indicate the sender device to reply to.
It is possible to target all devices via `*`, but using a specific device ID makes sending much
more efficient.

Element Call, which this MSC was initially created for, is cheating a little bit. As Element Call
is a special widget anyway, it is created with custom code instead of the normal widget API setup
code. This allows embedding additional parameters into the widget URL, including the device ID.
From the perspective of bringing the product to market quickly, I think that is fine, but we
should focus on getting this MSC into a complete state.

Therefore I suggest adding a small section here:

**Passing device id to widgets**

A new template variable is added to the available options for a widget URL:

* `device_id` - The device id of the client instance which is rendering the widget.

And extend the unstable prefix section with:

* `org.matrix.msc3819.device_id` in place of the proposed variable.

If this extension is desired, I'm open to adding the implementation to the matrix-react-sdk and
matrix-widget-api.

Follow-up comment:

I started working on implementing this addition over at matrix-org/matrix-widget-api#78

Follow-up comment:

In matrix-org/matrix-widget-api#78, we decided to slightly rename the parameter. Thus the amendment
should be as follows:

**Passing device id to widgets**

A new template variable is added to the available options for a widget URL:

* `matrix_device_id` - The device id of the client instance which is rendering the widget.

And extend the unstable prefix section with:

* `org.matrix.msc3819.matrix_device_id` in place of the proposed variable.

## Alternatives

As discussed in the introduction of this proposal and on other MSCs, we could expose the client-server
API more generically to the widget. This causes issues where the client is either forced to parse
requests like a webserver would to validate that the widget is allowed to make the request, or
require such a generic capability that widgets would excessively request full read/write access
from the user without consideration for the impact that might have. As such, we continue to describe
special-cased actions for the widget API on a case-by-case basis.

On other related proposals there's discussion about how a bot could achieve the same function as
the proposal. While also partially true here, the intent is not to have a game or similar publishing
events into a room but rather to have a second Matrix client (for all intents and purposes) embedded
either as a room widget or account widget. A bot precludes the second client from acting on behalf
of the user who has it open.

## Security considerations

Because the widget can implicitly decrypt events, it is absolutely imperative that clients
prompt for permission to use these capabilities even though the capabilities negotiation does not
require this to be done. Strictly speaking, clients which do not prompt for confirmation from the
user are frowned upon; however, given the intended use case of VoIP signaling, it is reasonable to
auto-approve some capabilities if the client can verifiably trust the widget is running safe code.
In general, verifiable trust only comes from the client locking widgets down to specific domains
or rewriting the widget URL before rendering to something the client controls.

This MSC allows widgets to access sensitive parts of the client-server API, and the encryption
module specifically. If granted permission, a widget could feasibly harvest decryption keys *in clear
text*. It is strongly encouraged that clients do not auto-approve capabilities for key exchanges
or similar. In fact, it might even be reasonable for the client to auto-deny instead.

This MSC allows a room widget to act at the account level rather than the traditional room level.
Normally these events would be scoped to the currently active room, however to-device messages are
not tied to a room. Therefore, the events are exposed as-is to the widget and can be interacted with
as such.

## Unstable prefix

While this MSC is not present in the spec, clients and widgets should:

* Use `org.matrix.msc3819.` in place of `m.` in all new identifiers of this MSC.
* Only call/support the `action`s if a widget API version of `org.matrix.msc3819` is advertised.
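
One possible reading of the first rule is sketched below: only the leading `m.` of an identifier
defined by this MSC is swapped for the unstable prefix, while the embedded event type is left
alone. The function name is hypothetical, and whether implementations apply the prefix in exactly
this way is an assumption rather than something this section states.

```typescript
// Sketch only: `toUnstableIdentifier` is a hypothetical helper name.
const UNSTABLE_PREFIX = "org.matrix.msc3819.";

// Replace a leading `m.` with the unstable prefix; leave other identifiers untouched.
function toUnstableIdentifier(identifier: string): string {
  return identifier.slice(0, 2) === "m."
    ? UNSTABLE_PREFIX + identifier.slice(2)
    : identifier;
}

// Usage
console.log(toUnstableIdentifier("m.send.to_device:m.call.invite"));
// org.matrix.msc3819.send.to_device:m.call.invite
```

Only the capability's own namespace changes here; the `m.call.invite` event type after the colon
is not an identifier introduced by this MSC and so is not rewritten.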

## Dependencies

None applicable - this MSC's dependencies have either been approved or are used simply as reference
material. In practice, widgets should probably be formally in the spec before this MSC gets included.