Implement labels for examples #1066

tiziano88 · 2020-06-02T15:46:00Z

Similar to #972, but for the rest of the examples (which presumably will need simpler labels).

ipetr0v · 2020-06-29T12:32:18Z

Are hash-based labels already supported in Oak?

conradgrobler · 2020-08-25T15:28:11Z

Regarding the use of signature labels in the TIR example, I don't think that hash-based labels or signature-based labels are appropriate for the TIR example, as it should use per-user labels to be secure.

If hash- or signature-based labels are used, the output from the node is public once the data leaves the node. If an untrusted node receives the output (rather than the gRPC server pseudo node) then the response can be sent anywhere, not only to the user who requested it.

ipetr0v · 2020-08-25T15:31:49Z

But if we use per-user labels, will Oak be able to send requests to an external database?
The idea behind a signature label is that the user trusts (based on the signature) that this module will anonymize their data before sending requests to the database.

conradgrobler · 2020-08-25T15:37:24Z

Sending a query to an external database is not appropriate for private information retrieval either. The typical use case for private/trusted information retrieval is to not leak the sensitive information that is required to do the query. If the sensitive information that should be protected is removed by the node during declassification, the query can no longer be based on this sensitive data.

My understanding was that the external database would be sent into the node that has the user data (perhaps in batches if it is too big to be sent all at once), rather than the node making a query.

tiziano88 · 2020-08-25T16:14:57Z

I think I agree with @conradgrobler .

@ipetr0v how do you think using a signature-based label helps with this example in particular? Could you perhaps describe the workflow you have in mind, including who signs what, and why they should be trusted?

ipetr0v · 2020-08-25T16:28:05Z

Currently we have two modules in TIR:

- the TIR itself, which processes specific user requests
- a database_proxy, which is a generic implementation of private information retrieval that requests a specific database entry E without revealing it

The client sends requests for a specific database entry E to TIR, but the client doesn't trust the TIR module, because it's just a specific implementation created by a third party. The client trusts the database_proxy (e.g. one that was signed by Google) to request E from a database. Thus, the client assigns Google's public key to its data, because only database_proxy is trusted to know E and request it from an external database without revealing it.

And database_proxy was created by Google so that it can be used by multiple third-party applications that want to benefit from private information retrieval.

conradgrobler · 2020-08-25T16:33:59Z

What labels are associated with TIR and database_proxy? Where does the output from database_proxy go?

ipetr0v · 2020-08-25T16:36:22Z

There are no labels for TIR and a public_key label for database_proxy.

database_proxy works as follows:

- Some other module requests E from it
- database_proxy sends multiple requests to an external database and iteratively searches for an entry corresponding to E
- Then it returns the found entry to the module that asked for it
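
A minimal sketch of the lookup loop described above, under stated assumptions: `lookup_in_chunks`, `fake_database` and the `Chunk` type are hypothetical names used only for illustration and are not part of Oak or of the TIR example. The sketch also assumes that every chunk is fetched regardless of where E is found, so that the sequence of external requests does not depend on E.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical type: a chunk maps string ids to database entries.
Chunk = Dict[str, str]


def lookup_in_chunks(
    entry_id: str, fetch_chunks: Callable[[], Iterable[Chunk]]
) -> Optional[str]:
    """Search for entry_id by pulling the database into the node chunk by chunk."""
    found = None
    for chunk in fetch_chunks():
        # Keep iterating after a hit: every chunk is requested either way,
        # so the external requests look the same for any entry_id.
        if found is None and entry_id in chunk:
            found = chunk[entry_id]
    return found


def fake_database() -> Iterable[Chunk]:
    """Stand-in for the external database, served in two chunks."""
    yield {"a": "entry-a", "b": "entry-b"}
    yield {"c": "entry-c"}


print(lookup_in_chunks("c", fake_database))
print(lookup_in_chunks("z", fake_database))
```

Only the module running this loop needs to know E; the external database only ever sees chunk requests.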

conradgrobler · 2020-08-25T16:38:58Z

> Some other module requests E from it

In our case, is it TIR that requests it? If TIR has no labels, what is stopping it from leaking E or any other data received from the user?

ipetr0v · 2020-08-25T16:58:53Z

The idea behind database_proxy was to create a separate generic module that doesn't depend on specific user requests and just provides a string id lookup in a database (TIR just parses the response and sends it back to the user).

> In our case, is it TIR that requests it? If TIR has no labels, what is stopping it from leaking E or any other data received from the user?

I think once #1357 is implemented we would be able to add user token labels to a request, but it will also create the following problem:

- Since the database cannot fit into Oak, the module needs to iteratively request chunks of data from the database
- On each iteration it needs to search for E in the chunk, and thus the module (that sends requests outside of Oak) needs to know E
- But if we add user token labels to E, the module will only be able to send data to the user and not to the database

I think in our case the user needs to trust database_proxy's public key, and probably TIR's hash.
So the data label should be changed to hash(TIR) ∧ public_key(database_proxy)
It would mean that the user trusts a specific version of TIR, and can also use any version of the database proxy signed by Google.

And TIR also will need to declassify the data (but I don't know if declassification is implemented already).

ipetr0v · 2020-08-25T17:20:45Z

> In our case, is it TIR that requests it? If TIR has no labels, what is stopping it from leaking E or any other data received from the user?

Also, data is labeled with the public key, so IIUC only modules labeled with it can declassify this data.
And thus, only database_proxy will be able to declassify it.
(this is the example with public_key(database_proxy) label)

conradgrobler · 2020-08-26T07:40:20Z

There are two things that are unclear to me:

- If TIR has no label it is public, so it cannot see data that is labeled with public_key(database_proxy). How does the data flow in the example?
- Once database_proxy has declassified the results, the results are public. What stops other nodes from sending these public results anywhere else?

ipetr0v · 2020-08-26T10:17:34Z

> If TIR has no label it is public, so it cannot see data that is labeled with public_key(database_proxy). How does the data flow in the example?

I think it can, it just cannot send it outside of Oak.

> Once database_proxy has declassified the results, the results are public. What stops other nodes from sending these public results anywhere else?

In this case, it looks like we need to trust both modules (TIR and proxy) and use the hash(TIR) ∧ public_key(database_proxy) label, because we cannot label data with user tokens: the proxy would not be able to declassify requests.

tiziano88 · 2020-08-26T10:24:29Z

Let us start with a simpler case, in which the entire database is already in-memory in the node (no lookups to an external server).

@ipetr0v in this case, what is the label of the incoming data, and of the various nodes?

conradgrobler · 2020-08-26T10:24:54Z

> I think it can, it just cannot send it outside of Oak.

IFC will stop data with a non-public label from being read by a node with a public label. A node with a public label can always send data outside of Oak.

> In this case, it looks like we need to trust both modules (TIR and proxy).

This just moves the problem elsewhere. A malicious application owner could add another public untrusted module that then receives the declassified output from TIR and sends it elsewhere. The user connecting to the application has no way of knowing whether the output from TIR goes back to the gRPC server, or to some other node.
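
To make the flow rule concrete, here is a toy model, not Oak's actual API: `Label`, `PUBLIC` and `can_flow` are hypothetical names, and a label is reduced to a set of confidentiality tags read as a conjunction. Under that assumption, data may only flow to a node whose label carries at least the same tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Toy confidentiality label: a set of tags, read as a conjunction."""
    confidentiality: frozenset = frozenset()


# A label with no tags is public.
PUBLIC = Label()


def can_flow(source: Label, target: Label) -> bool:
    """Data may flow only to a target that is at least as restrictive."""
    return source.confidentiality <= target.confidentiality


data = Label(frozenset({"public_key(database_proxy)"}))
tir = PUBLIC  # TIR has no label in the example
proxy = Label(frozenset({"public_key(database_proxy)"}))

print(can_flow(data, tir))      # labeled data cannot reach the public TIR node
print(can_flow(data, proxy))    # but it can reach the proxy
print(can_flow(PUBLIC, proxy))  # public data can flow anywhere
```

In this model a public node cannot read the labeled data at all, and once data is public nothing restricts where it goes next.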

ipetr0v · 2020-08-26T10:38:41Z

> in this case, what is the label of the incoming data, and of the various nodes?

Currently it's just public. I started to add a public key label for database_proxy, but that is not the final IFC design for TIR, it's just an example of using public key labels.
But in the case when the whole database can fit in Oak, we don't need public key labels - we just need a user token label (because the database is already there, and there is no need to make external requests).

Problems arise when we cannot fit the whole database (need to make requests) and also want to prohibit the result from being sent somewhere else.

ipetr0v · 2020-08-26T10:52:23Z

Also, the problem that @conradgrobler is describing may well arise in any application where developers try to use third-party modules for routines involving declassification.

ipetr0v · 2020-08-26T10:56:08Z

I think one possible solution to this problem would be to use disjunctions (#1207) ∨ in the label.

So the user-assigned label would be token(user) ∨ public_key(database_proxy). This would allow database_proxy to declassify requests, and would also prevent any other node from sending data anywhere other than back to the original user.

tiziano88 · 2020-08-26T13:40:39Z

I am not sure disjunctions would help: a disjunction used in a confidentiality label is strictly weaker than either of the original principals, right?

ipetr0v · 2020-08-26T13:41:57Z

But it would mean that either database_proxy or gRPC server/client sending data to client can declassify data, and no one else.
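
One way to compare the two readings, as a toy model only: the function names below are hypothetical, and the semantics of disjunctions in #1207 may differ. The sketch assumes that a conjunctive label can be declassified only with privilege over every tag, while a disjunctive label can be declassified with privilege over any one of them.

```python
from typing import FrozenSet

Tags = FrozenSet[str]


def can_declassify_conjunction(privilege: Tags, label: Tags) -> bool:
    """A ∧ B: the declassifier must hold privilege for every tag."""
    return label <= privilege


def can_declassify_disjunction(privilege: Tags, label: Tags) -> bool:
    """A ∨ B: holding privilege for any single tag is enough."""
    return not label.isdisjoint(privilege)


label = frozenset({"token(user)", "public_key(database_proxy)"})
proxy = frozenset({"public_key(database_proxy)"})
grpc_server = frozenset({"token(user)"})
other = frozenset({"hash(some_other_module)"})

for name, privilege in [("proxy", proxy), ("grpc_server", grpc_server), ("other", other)]:
    print(
        name,
        can_declassify_conjunction(privilege, label),
        can_declassify_disjunction(privilege, label),
    )
```

Under these assumptions the disjunction lets either database_proxy or the gRPC server declassify and nobody else, which is also why it is weaker than either principal alone: more parties can remove the label.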

conradgrobler · 2020-11-09T15:28:50Z

What labels should we use with the other examples?

All of the label types are a bit inconvenient to maintain:

- Wasm hash labels mean that the hash must be updated in multiple places in code every time the code is recompiled (e.g. when the SDK changes) and that the updated binary must be pushed to storage
- Signature labels mean that the binary must be pushed to storage and the signature must be updated when the code is recompiled
- Per-user labels mean that we need to have a router node that creates a new set of nodes per user

I think that per-user labels would require the least amount of ongoing maintenance, even though it requires more upfront work.

tiziano88 · 2020-11-16T16:42:42Z

> What labels should we use with the other examples?
>
> All of the label types are a bit inconvenient to maintain:
>
> - Wasm hash labels mean that the hash must be updated in multiple places in code every time the code is recompiled (e.g. when the SDK changes) and that the updated binary must be pushed to storage
> - Signature labels mean that the binary must be pushed to storage and the signature must be updated when the code is recompiled
> - Per-user labels mean that we need to have a router node that creates a new set of nodes per user
>
> I think that per-user labels would require the least amount of ongoing maintenance, even though it requires more upfront work.

These are not interchangeable options: they have different meanings and security properties. We should not choose between them from a convenience point of view, but rather based on what makes sense for the individual examples.

In particular, I think:

- I expect Wasm hash labels will rarely be used in practice, unless there is unanimous agreement that, e.g., a specific piece of functionality is implemented by a Wasm module with a specific hash. Even if / when this is the case, it is still more likely that such agreements will be published in the form of a signature over the hash itself, so a signature label probably makes sense for that use case.
- Signature labels are used in cases in which we expect Wasm code to declassify data, which would happen only for self-contained pieces of logic that are manually reviewed, with these reviews published by the appropriate verifiers (alongside their public keys). From the current set of examples, I think this only applies to:
  - https://github.com/project-oak/oak/tree/main/examples/private_set_intersection
  - https://github.com/project-oak/oak/tree/main/examples/aggregator
- User labels may be used by anything that operates on per-user data, since these may only be declassified by the gRPC / HTTP server nodes, which are already trusted. Most applications would just rely on these, even if it means that we need to implement the router node pattern in more applications (which, BTW, is mostly done already).
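
The router node pattern mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the Oak SDK: `Router`, `UserNode`, `route` and `node_count` are hypothetical names, and a plain string stands in for a user token label.

```python
from typing import Callable, Dict


class UserNode:
    """Hypothetical per-user node; `label` stands in for a user token label."""

    def __init__(self, label: str) -> None:
        self.label = label

    def handle(self, request: str) -> str:
        return f"[{self.label}] handled {request}"


class Router:
    """Creates one node per user label and forwards each request to it."""

    def __init__(self, make_node: Callable[[str], UserNode]) -> None:
        self._make_node = make_node
        self._nodes: Dict[str, UserNode] = {}

    def route(self, user_token: str, request: str) -> str:
        label = f"token({user_token})"
        # Lazily create the per-user node the first time this user is seen.
        if label not in self._nodes:
            self._nodes[label] = self._make_node(label)
        return self._nodes[label].handle(request)

    def node_count(self) -> int:
        return len(self._nodes)


router = Router(UserNode)
print(router.route("alice", "q1"))
print(router.route("bob", "q2"))
print(router.route("alice", "q3"))
print(router.node_count())
```

The router here only forwards requests; it never inspects or declassifies the per-user data, which stays with the node carrying that user's label.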

tiziano88 · 2020-11-16T17:54:47Z

@ipetr0v I think you are already looking into the remaining labels for some of the examples, so I am assigning this to you. There may be additional ones for the chat examples to be assigned once #1452 is fixed, which I can help look into more closely when the time comes.

ipetr0v · 2020-11-18T13:39:59Z

Since we will probably implement the Router pattern in every module that uses labels, we will also need to split those modules, so that the Router does not get declassification privileges.

tiziano88 · 2020-11-18T13:45:49Z

main and router can be in the same module. The module that does declassification will be in a separate one. I think this should be enough, and makes sense in terms of reusability.

ipetr0v · 2020-11-18T15:50:44Z

I also think that it makes sense to use Certificate labels for simple examples, since they are easier to maintain and they also capture the notion of updatable modules better.

ipetr0v · 2020-11-19T12:46:44Z

Also, trusted_database, as well as a lot of the other examples, requires client public key labels.
IIUC right now they are only implemented for the HTTP server, right?

tiziano88 · 2020-11-19T12:51:14Z

Correct, @rbehjati is adding support for gRPC

rbehjati · 2020-11-19T13:06:40Z

Yes. I'll add the authentication to the trusted DB example as @ipetr0v suggested, and I'll be done with the PR (#1707).

ipetr0v · 2020-11-20T16:28:31Z

I think that, for the hello_world and translator examples, it makes sense to create individual nodes per request and assign them client-related labels (similar to how trusted_database works), because these examples do not perform declassification and don't require either hash or signature labels.

ipetr0v · 2021-01-25T12:27:25Z

I think the following commit (29822e2) was the last one for this issue.

tiziano88 added the P1 label Jun 2, 2020

tiziano88 mentioned this issue Jun 3, 2020

Wasm module files are not reproducibly buildable #865

Closed

daviddrysdale added this to the IFC v0 milestone Jun 27, 2020

ipetr0v mentioned this issue Aug 25, 2020

Use signature label in Private Set Intersection #1392

Merged

2 tasks

This was referenced Aug 28, 2020

Re-implement TIR example #1407

Closed

Move from minisign to ring #1387

Merged

tiziano88 mentioned this issue Sep 3, 2020

Tracking issue: IFC evolution #1431

Closed

14 tasks

ipetr0v added a commit that referenced this issue Sep 10, 2020

Use signature label in Private Set Intersection (#1392)

7d4553f

This change adds a signature label to Private Set Intersection example. Fixes #1344 Ref #1066

tiziano88 mentioned this issue Nov 16, 2020

Add with_privilege versions of node and channel function to ABI #1670

Closed

tiziano88 assigned ipetr0v Nov 16, 2020

ipetr0v mentioned this issue Nov 19, 2020

Use Router in Trusted Database #1748

Merged

ipetr0v added a commit that referenced this issue Nov 20, 2020

Use Router in Trusted Database (#1748)

a62ae89

This change updates Trusted Database example to use Router pattern. Ref #1066

ipetr0v closed this as completed Jan 25, 2021
