MSC2918: Refresh tokens #2918

Merged
merged 22 commits into from
Sep 28, 2021
Changes from 5 commits
ab50b62
Refresh tokens MSC
sandhose Dec 18, 2020
f8dad2a
MSC2918: minor changes
sandhose Jan 14, 2021
0e615f7
MSC2918: access token expiration as milliseconds
sandhose May 20, 2021
870cded
MSC2918: account registration API changes
sandhose May 20, 2021
6530ecc
MSC2918: fix `expires_in_ms` example
sandhose May 20, 2021
b320001
MSC2918: add precision about token revocation
sandhose Jun 3, 2021
d433e3b
MSC2918: specify error codes for the refresh API
sandhose Jun 3, 2021
87566c3
MSC2918: clarify that the change also applies to ASes
sandhose Jun 3, 2021
269fcac
Apply suggestions from code review
sandhose Jul 1, 2021
4d73b7e
MSC2918: clarify what problem this MSC solves
sandhose Jul 15, 2021
db8ceab
MSC2918: minor formatting and rephrasing
sandhose Jul 15, 2021
9bbb4c5
MSC2918: clarify ratelimiting, masquerading and authentication on ref…
sandhose Jul 15, 2021
a050dc3
MSC2918: make expires_in_ms/refresh_token optional
sandhose Jul 15, 2021
2c11e6f
MSC2918: soft logout in refresh token API
sandhose Jul 15, 2021
4cd94e3
MSC2918: add detailed rationale
sandhose Aug 12, 2021
04ae1c3
MSC2918: minor fix
sandhose Aug 12, 2021
488e9e1
MSC2918: clarifications on backward compatibility
sandhose Sep 9, 2021
4cf821c
MSC2918: advertise support in the request body
sandhose Sep 10, 2021
c076763
MSC2918: clarify on what happen when token expire
sandhose Sep 10, 2021
a157cc3
MSC2918: remove redundant precision about token expiration and lifetime
sandhose Sep 23, 2021
ed54213
MSC2918: minor clarification
sandhose Sep 23, 2021
70b2dfc
MSC2918: soft logout when using expired token
sandhose Sep 23, 2021
78 changes: 78 additions & 0 deletions proposals/2918-refreshtokens.md
@@ -0,0 +1,78 @@
# MSC2918: Refresh tokens

Requests are currently authenticated using non-expiring, revocable access tokens.
This goes against security best practices known in the OAuth 2.0 world.
This MSC makes access tokens expire and introduces refresh tokens to renew them, in order to mitigate token replay attacks.

## Proposal

The access token returned by the login endpoint expires after a short amount of time, forcing the client to renew it with a refresh token.
A refresh token is issued on login and is rotated each time it is used.

Homeservers can choose to make the access tokens signed and non-revocable for performance reasons if the expiration is short enough (less than 5 minutes).

### Login API changes

The login API returns two additional fields:

- `expires_in_ms`: The lifetime in milliseconds of the access token.
- `refresh_token`: The refresh token, which can be used to obtain new access tokens.
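
A client consuming these two fields might turn the relative lifetime into a local deadline, roughly as sketched below. The `Session` type, `session_from_login`, `needs_refresh`, and the five-second safety margin are all illustrative names and values, not part of the proposal; treating both fields as possibly absent is a defensive assumption.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    access_token: str
    refresh_token: Optional[str]  # None if the server issued no refresh token
    expires_at: Optional[float]   # local monotonic deadline in seconds, None if unknown


def session_from_login(body: dict, now: Optional[float] = None) -> Session:
    """Build a Session from a decoded login response body."""
    now = time.monotonic() if now is None else now
    expires_in_ms = body.get("expires_in_ms")
    # expires_in_ms is a lifetime, so convert it to an absolute local deadline.
    expires_at = None if expires_in_ms is None else now + expires_in_ms / 1000.0
    return Session(body["access_token"], body.get("refresh_token"), expires_at)


def needs_refresh(session: Session, now: Optional[float] = None,
                  margin_s: float = 5.0) -> bool:
    """True once the access token is within margin_s seconds of expiring."""
    now = time.monotonic() if now is None else now
    return session.expires_at is not None and now >= session.expires_at - margin_s
```

Using a monotonic clock rather than wall-clock time keeps the deadline stable if the system clock is adjusted while the session is alive.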


### Account registration API changes

Unless `inhibit_login` is `true`, the account registration API returns two additional fields:
Member

Shouldn't this be if it's false? I.e., we include the params when we return a valid access token?

Member Author

The spec says:

inhibit_login: If true, an access_token and device_id should not be returned from this call, therefore preventing an automatic login. Defaults to false.

So that sentence seems correct? If inhibit_login is true, it will not return the additional fields

Member

it's not the easiest thing to grok, but @sandhose is right, and @erikjohnston is confused.


- `expires_in_ms`: The lifetime in milliseconds of the access token.
- `refresh_token`: The refresh token, which can be used to obtain new access tokens.
Member

Does the refresh token expire? How does one manually expire it?

Member

also would be interested to see how this plays with soft logout: if the access token expires, but the refresh token is still live, should the server be using soft_logout: true in expiration responses? If so, how should the server clean up the device once the refresh token also expires?

Member Author

Refresh tokens don't expire, but they get invalidated on use.
By the way, the current implementation in Synapse is incompatible with the session_lifetime config parameter, which was the one that made access tokens expire and led to soft logouts.

also would be interested to see how this plays with soft logout: if the access token expires, but the refresh token is still live, should the server be using soft_logout: true in expiration responses?

I was not aware of how soft_logout works. :)
Clients should now know when the token expires, so soft_logout on access token expiration should not happen as often. I'd still clarify that soft_logout also applies when using the /refresh endpoint.

Member

Refresh tokens don't expire, but they get invalidated on use.

This particular piece is a bit concerning, as it means that refresh tokens are hanging around waiting to give access back to the account. On the other hand, this somewhat fixes the script use case, since a script can store the refresh token and use it on the next run.

Member Author

It still heavily mitigates the impact of token leakage, since they are rotating.
This is what the IETF's security best practices document for OAuth 2.0 says about refresh tokens:

*  *Refresh token rotation:* the authorization server issues a new
      refresh token with every access token refresh response.  The
      previous refresh token is invalidated but information about the
      relationship is retained by the authorization server.  If a
      refresh token is compromised and subsequently used by both the
      attacker and the legitimate client, one of them will present an
      invalidated refresh token, which will inform the authorization
      server of the breach.  The authorization server cannot determine
      which party submitted the invalid refresh token, but it will
      revoke the active refresh token.  This stops the attack at the
      cost of forcing the legitimate client to obtain a fresh
      authorization grant.

tl;dr: if there is an attempt to use an old refresh token, there might be a token leak somewhere and the whole session should be invalidated. This could be mentioned in the MSC and/or in the spec, and implemented in Synapse if you think it makes sense.
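
The rotation-with-reuse-detection idea in the quoted text could look roughly like the in-memory sketch below. It is only an illustration of the technique: `RefreshTokenStore`, `issue`, and `rotate` are hypothetical names, the MSC does not mandate this behaviour, and a real homeserver would persist this state and also revoke the session's access tokens.

```python
import secrets
from typing import Dict, Optional


class RefreshTokenStore:
    """Toy store: one active refresh token per session, with replay detection."""

    def __init__(self) -> None:
        self._active: Dict[str, str] = {}  # session_id -> current refresh token
        self._owner: Dict[str, str] = {}   # every token ever issued -> session_id

    def issue(self, session_id: str) -> str:
        """Issue a new refresh token and make it the session's active one."""
        token = secrets.token_urlsafe(16)
        self._active[session_id] = token
        self._owner[token] = session_id
        return token

    def rotate(self, presented: str) -> Optional[str]:
        """Exchange a refresh token for a new one, or return None on failure."""
        session_id = self._owner.get(presented)
        if session_id is None:
            return None  # token was never issued by this store
        if self._active.get(session_id) != presented:
            # An already-rotated token was replayed: treat it as a possible
            # leak and revoke the session's active refresh token.
            self._active.pop(session_id, None)
            return None
        return self.issue(session_id)
```

Keeping the token-to-session relationship for old tokens is what lets the store tell a replayed token apart from an unknown one.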

Member Author

Also clarified about soft_logout in the refresh token API in 2c11e6f

Member

ah ha, so the server would revoke the access token when a refresh token is used twice?

Member Author

It could, although not strictly enforced by this MSC (and the current implementation in Synapse does not do that)

Member

@turt2live is this clear enough now?


### Token refresh API

This API lets the client refresh the access token.
A new refresh token is also issued, and the existing one is revoked.
The Matrix server doesn't have to make the old access token invalid, since its lifetime is short enough.

`POST /_matrix/client/r0/refresh`

```json
{
  "refresh_token": "aaaabbbbccccdddd"
}
```

response:

```json
{
  "access_token": "xxxxyyyyzzz",
  "expires_in_ms": 60000,
  "refresh_token": "eeeeffffgggghhhh"
}
```
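
On the client side, the exchange above might be driven by something like the following sketch, which only builds the request and applies a decoded response; nothing is sent over the network. `build_refresh_request` and `apply_refresh_response` are hypothetical helper names, and no `Authorization` header is set because this revision of the proposal does not say whether one is required.

```python
import json
import urllib.request


def build_refresh_request(homeserver: str, refresh_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the refresh endpoint."""
    body = json.dumps({"refresh_token": refresh_token}).encode("utf-8")
    return urllib.request.Request(
        homeserver.rstrip("/") + "/_matrix/client/r0/refresh",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def apply_refresh_response(credentials: dict, response_body: bytes) -> dict:
    """Return a copy of credentials updated from a refresh response body."""
    parsed = json.loads(response_body)
    updated = dict(credentials)
    updated["access_token"] = parsed["access_token"]
    updated["expires_in_ms"] = parsed["expires_in_ms"]
    # The old refresh token has been revoked, so it must be overwritten.
    updated["refresh_token"] = parsed["refresh_token"]
    return updated
```

Passing the request to `urllib.request.urlopen` would perform the call. Because the refresh token is rotated, the new one should be persisted before the old one is discarded, otherwise a crash between the two steps could lock the session out.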

### Device handling

The current spec states that "Matrix servers should record which device each access token is assigned to".
This must be updated to reflect that devices are bound to a session, which is created during login and stays the same after the token is refreshed.

## Potential issues

The refresh token being rotated on each refresh is strongly recommended in the OAuth 2.0 world for unauthenticated clients to avoid token replay attacks.
This can however make the deployment of CLI tools for Matrix a bit harder, since the credentials can't be statically defined anymore.
This is not an issue in OAuth 2.0 because usually CLI tools use the client credentials flow, also known as service accounts.
An alternative would be to make the refresh token non-rotating for now, recommend that clients support refresh token rotation, and enforce it later on.

## Alternatives

This MSC defines a new endpoint for token refresh, but it could also be integrated as a new authentication mechanism.
Member

to expand on this, and the "potential issues" section above, what are the concerns with introducing it as some form of opt-in (or opt-out) mechanism for things like long-lived bots or scripts which do not easily have a refresh opportunity? For example, a nightly batch job to prune rooms/events/etc could use a static access token instead of having to login, do the work, then log out again, which would put the password near the script rather than a single revocable token.

Member Author

I think for both use cases (bots and scripts) I'd rather make use of the org.matrix.login.jwt login type with some KMS signing it for the initial login and still have the token refresh. Storing long-lived access tokens without proper secret handling is at least as bad as storing the login/pass of the bot IMO, especially if the user has admin access.
If we still need some kind of static access token, I'd rather have that in Synapse (in the config or something) than in the spec

Member

Fair, that sounds reasonable. Just wanted to expand on the potential usecase, but agreed that scripts can find other ways to authenticate (or better yet: be replaced by features within the protocol/homeserver implementation)

Contributor

I think an authentication option for scripts needs to be in the spec. I have a lot of scripts that push notifications or upload files from CI jobs for example. Those use access tokens, because CI jobs do sometimes get compromised (happened once because of codecov) and that way the access token can be easily rotated without being a homeserver admin. If the script used username and password instead, an attacker would have been able to get past UIA and change the password and just in general do much more nasty stuff than with an access token. The jobs also can't refresh the access token, since they may be running concurrently and can't change CI variables.

What would be my alternative for that use case, that works independent of the specific homeserver implementation?

Member Author

There is an ongoing effort to rework the whole authentication process, with use cases like scripts running in CI in mind. This MSC is also meant to prepare clients for the eventual migration to this new authentication stack without requiring them to log out of all their existing sessions.

The login API with non-expiring tokens will hopefully stay until this new auth stack is ready, so by the time you need to migrate you will have a proper alternative.

In the meantime, if you still want to adopt refresh tokens and you are an admin of your homeserver, I suggest you look into the org.matrix.login.jwt login type in Synapse. Even though it is not standard, it will let you log in using a JWT signed by some party.
It might still need some changes in Synapse to allow restricting the tokens (like not being able to use them for UIA, to avoid letting the account be nuked), but I prefer to go that route rather than adding this kind of special case to this MSC if it will be superseded by something else soon-ish.


## Security considerations

The time to live (TTL) of access tokens isn't enforced in this MSC but is advised to be kept relatively short.
Servers might choose to have stateless, digitally signed access tokens (JWTs are a good example of this), which makes them non-revocable.
The TTL of access tokens should not exceed 15 minutes if they are revocable and 5 minutes if they are not.
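
As a worked example of these two limits, a server-side configuration check could be sketched as below. The constant and function names are illustrative, and rejecting non-positive lifetimes is an added assumption; the proposal only advises the upper bounds and does not enforce them.

```python
# Advised upper bounds from the text above, expressed in milliseconds
# to match the unit of expires_in_ms.
MAX_TTL_REVOCABLE_MS = 15 * 60 * 1000
MAX_TTL_NON_REVOCABLE_MS = 5 * 60 * 1000


def ttl_within_recommendation(ttl_ms: int, revocable: bool) -> bool:
    """Check a configured access token lifetime against the advised bounds."""
    limit = MAX_TTL_REVOCABLE_MS if revocable else MAX_TTL_NON_REVOCABLE_MS
    # A zero or negative lifetime is treated as invalid (an assumption).
    return 0 < ttl_ms <= limit
```

For instance, a 10-minute lifetime (600000 ms) would pass for revocable tokens but not for stateless signed ones.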

## Unstable prefix

While this MSC is not in a released version of the specification, clients should add a `org.matrix.msc2918.refresh_token=true` query parameter on the login endpoint, e.g. `/_matrix/client/r0/login?org.matrix.msc2918.refresh_token=true`.
The refresh token endpoint should be served and used under the unstable prefix: `POST /_matrix/client/unstable/org.matrix.msc2918/refresh`.
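
A client that wants to switch between the unstable and released forms could centralise the URL construction, as in this sketch. `login_url`, `refresh_url`, and the `unstable` flag are hypothetical; only the paths and the query parameter come from the text above.

```python
from urllib.parse import urlencode

UNSTABLE_FLAG = "org.matrix.msc2918.refresh_token"
UNSTABLE_REFRESH_PATH = "/_matrix/client/unstable/org.matrix.msc2918/refresh"
STABLE_REFRESH_PATH = "/_matrix/client/r0/refresh"


def login_url(homeserver: str, unstable: bool = True) -> str:
    """Login URL, with the opt-in query parameter while the MSC is unstable."""
    base = homeserver.rstrip("/") + "/_matrix/client/r0/login"
    if unstable:
        return base + "?" + urlencode({UNSTABLE_FLAG: "true"})
    return base


def refresh_url(homeserver: str, unstable: bool = True) -> str:
    """Refresh endpoint URL under the unstable or the released prefix."""
    path = UNSTABLE_REFRESH_PATH if unstable else STABLE_REFRESH_PATH
    return homeserver.rstrip("/") + path
```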