feat: SMT Proofs (Inclusion and Exclusion) #648

bvrooman · 2023-12-20T12:28:58Z

Related issues:

Support SMT proofs generation #639

This PR:

Adds support for generating SMT proofs (inclusion and exclusion)
Adds support for verifying SMT proofs

Given an SMT, we can generate a proof for any key within the 32 byte key space. If the key is included in the tree, i.e., has been used in an update operation and not deleted, then we generate an inclusion proof. If the key has not been included in the tree, we generate an exclusion proof. Inclusion proofs verify that the given key-value is, in fact, in the tree. If the value undergoing verification does correspond to the given key, verification fails. Exclusion proofs verify that the key is a placeholder. If the given key is not a placeholder in the tree, verification fails.

Example

Consider the following example:

We have a key-value data set, upon which we are building a sparse Merkle tree.
The data set is structured as such:

Key	Value
000	42
010	"Foo"
110	"Bar"

The data set contains 3 key values. Because the key space supports up to 8 keys (000 - 111), we know that there are 5 unused slots in the key space.

For each unused slot, we know that the value is unset. Therefore, the complete key-value data set looks like this:

Key	Value
000	42
001
010	"Foo"
011
100
101
110	"Bar"
111

The corresponding sparse Merkle tree is constructed on the key-value data set:

Example Use Case

We can make statements about the data set. For example:

The key 000 has value 42 (inclusion). This statement is true.
The key 111 is excluded from the table (exclusion). This statement is true.
The key 100 is empty (exclusion). This statement is true.
The key 110 has value "Hello" (inclusion). This statement is false (the correct value is "Bar").
The key 110 is empty (exclusion). This statement is false (the key has value "Bar").
The key 101 has value "Foo" (inclusion). This statement is false (the key is empty).

We can categorize these statements into one of two categories:

I assert the key K has the value V (inclusion).
I assert the key K is unset (exclusion).

Assertion 1 is an assertion of knowledge: I assert that key K has value V. 1. For assertion 2, an equivalent assertion is: I assert that key K is not in the table.

For these assertions, we can generate cryptographic proofs. Respectively, these are proofs of inclusion and proofs of exclusion. Verification is the process of confirming a given statement using the proof. Verifying a statement and proof returns a boolean (true or false) output that can confirm the corresponding statement. If verification of the statement returns true, the statement has been proven. If verification of the statement returns false, the statement has not been proven.

Verification failure does not imply the opposite statement. For example, we cannot assert that a key is in the table without providing a specific value, i.e. "key K is included in the table (with some unknown value V)" using a failed exclusion assertion. If verification of the statement "key K has a zero value" returns false, it is possible that K has a non-zero value and is included in the tree; however, a failed proof of exclusion does not necessarily imply inclusion. This is because the proof provided to the verification can be somehow invalid or maliciously altered.

More concretely:

$V_{inclusion}(A_{key}, A_{value} = X, P) = true \implies A_{key} \in T, T[A_{key}] = X$
$V_{exclusion}(A_{key}, P) = true \implies A_{key} \notin T$
$V_{inclusion}(A_{key}, A_{value} = X, P) = false \centernot\implies A_{key} \notin T, T[A_{key}] \neq X$ (Proof may be invalid!)
$V_{exclusion}(A_{key}, P) = false \centernot\implies A_{key} \in T$

where:

$V$ is the verifier.
$A$ is the key-value pair in the statement undergoing verification.
$P$ is a proof that proves the statement undergoing verification
$T$ is the table of key-value data

Proofs that fail to validate do not imply the opposite statement.

MitchTurner · 2024-01-15T21:04:01Z

fuel-merkle/src/tests/sparse.rs

+    #[test]
+    fn generate_proof_and_verify_with_valid_key_value_returns_true((key_values, tree) in random_tree()) {
+        let mut rng = StdRng::seed_from_u64(0xBAADF00D);
+        if let Some((key, value)) = key_values.choose(&mut rng).cloned() {


I think this will always choose the same index. You could provide the random index in the test declaration.

The simplest way to do that would probably be just add a arb_num: usize and then in the code: let index = arb_num % key_values.len(), or something like that.

xgreenx

I think it is bad that all these functions panics.

If these functions are for internal purposes only, they should be private. But since we are exposing the Node type, maybe we want to rework how they are used.

Or, instead of exposing Node types, you can create another type for the ExclusionProof.

xgreenx · 2024-03-07T21:55:00Z

fuel-merkle/src/sparse/proof.rs

+            leaf,
+        } = self;
+        let key = key.into();
+        let mut current = *leaf.hash();


Almost fixed=D The problem is that getter leaf_key has an internal assert. So if someone provides a node, it will fail.

Added a test case with a panic=)

I think if the node is not a placeholder or a leaf, we need to return false. Overwise the proof provider may try to trick us=)

xgreenx · 2024-03-07T21:56:37Z

fuel-merkle/src/sparse/proof.rs

+pub struct ExclusionProof {
+    pub root: Bytes32,
+    pub proof_set: ProofSet,
+    pub(crate) leaf: Node,


I think it should be public and Node type as well. Otherwise, it is not possible to transfer this object between applications. We need to also implement serialization and deserialization somehow=)

Also, it would be nice if we implement serde on top of that

bvrooman · 2024-03-07T23:42:21Z

Almost fixed=D The problem is that getter leaf_key has an internal assert. So if someone provides a node, it will fail.

Added a test case with a panic=)

I think if the node is not a placeholder or a leaf, we need to return false. Overwise the proof provider may try to trick us=)

Yep, that's a good point.

The asserts should be debug_assert. I can fix that.
I will also add a check that the node in the proof is a leaf node/placeholder, and if not, return false.

I moved the test case you wrote into the proof.rs tests and made it deterministic, since we cannot guarantee that loading a node from the proof set will succeed in random a test setup.

I think it should be public and Node type as well. Otherwise, it is not possible to transfer this object between applications. We need to also implement serialization and deserialization somehow=)

Also, it would be nice if we implement serde on top of that

Maybe it makes sense to store a subset of node data in the proof, rather than the whole node? E.g., just leaf_key: Bytes32 and leaf_data: Bytes32, and we reconstruct the leaf from this using something like Node::new(0 /*height*/, Prefix::Leaf, leaf_key, leaf_data).

This way, we:

can easily implement serialization/deserialization on the proof structs (without also implementing it for Node)
ensure that there is no way for a user to provide an internal node (Prefix::Node) that breaks code like above
do not need to expose the Node structure further

edit:

Or, instead of exposing Node types, you can create another type for the ExclusionProof.

I think this you are suggesting something similar to what I am describing: a "subset type" of Node that can be used for just the ExclusionProof

# Conflicts: # CHANGELOG.md

bvrooman · 2024-03-11T00:43:58Z

Re: serde for proofs: #694

bvrooman · 2024-03-11T02:25:19Z

This PR is ready for the next round of reviews.

I'm leaving serde for a separate task, because that will require changes that are not in scope of proofs.

xgreenx · 2024-03-11T07:29:43Z

fuel-merkle/src/sparse/node.rs

    pub fn leaf_key(&self) -> &Bytes32 {
-        assert!(self.is_leaf());
+        debug_assert!(self.is_leaf());
        self.bytes_lo()
    }

    pub fn leaf_data(&self) -> &Bytes32 {
-        assert!(self.is_leaf());
+        debug_assert!(self.is_leaf());
        self.bytes_hi()
    }

    pub fn left_child_key(&self) -> &Bytes32 {
-        assert!(self.is_node());
+        debug_assert!(self.is_node());
        self.bytes_lo()
    }

    pub fn right_child_key(&self) -> &Bytes32 {
-        assert!(self.is_node());
+        debug_assert!(self.is_node());
        self.bytes_hi()
    }


I think public getters should return Option instead.

For internal usage, you can create a private version of this function. Maybe even it is better to mark them unsafe

Node is public only to the fuel-merkle crate, i.e. pub(crate) enum Node. The only place Node should ever be used is in the context of the MerkleTree, where we already know what kind of node we are using. There is no reason for it to be used in a context where we do not know the kind of node, such as user space or another crate, which is why it is marked pub(crate). Therefore, I do not agree that these should return Option - we always know what data we are accessing, i.e, we could always unwrap the Option; we would never need to branch based on the variant.

The debug_asserts are there for fuel-merkle developers who might modify the update/delete algorithms, not for consumers of the fuel-merkle crate. But if you prefer, I can mark these functions as pub(crate) as well to make that explicit.

I also disagree that these functions are unsafe, because it is always "safe" to call them, even in the wrong context. You cannot create an unsafe/incoherent state using these functions.

If they are used only within the crate, let's mark them pub(crate) to avoid confusion in the future=)

They are unsafe from a business perspective. I only brought attention to it because this commit shows that it is not always true. And from the API perspective, it would be right to have a separate bunch of functions only for internal MerkleTree work. And pub or pub(crate) functions that may fail outside of the MerkleTree implementation.

The unsafe for internal functions will show that it is dangerous to use. But the comment may describe why it is safe. Or we can always return an option and write expect and the reason why it is expected to be valid.

If they are used only within the crate, let's mark them pub(crate) to avoid confusion in the future=)

We can also use pub(in crate::sparse) to restrict this scope further. We could even rearrange the file structure to put Node under the merkle_tree module and do pub(in crate::sparse::merkle_tree) or really just pub(super) to make this very explicit. Any part of the interface marked pub is therefore also pub(in crate::sparse::merkle_tree) because of Rust's visibility rules.

They are unsafe from a business perspective. I only brought attention to it because this commit shows that it is not always true. And from the API perspective, it would be right to have a separate bunch of functions only for internal MerkleTree work. And pub or pub(crate) functions that may fail outside of the MerkleTree implementation.

I agree that we should differentiate between external and internal APIs, and that they can have different safety guarantees. At the moment, there is really only an internal API for Node. Putting Node in Proof violated that agreement by putting Node in user-space and mixing business logic with internal logic. That was bad! Node should only be concerned with internal logic, never business logic. Enforcing a more narrow visibility on Node will help reduce errors like that in the future.

To summarize: The interface was unsafe from a business perspective, but it was wrong to use it in business logic. I think isolating Node to internal logic is the first fix. If you still think unsafe should be used in this context, then I will add unsafe here.

Have a look at #696. I think this is a bit nicer: Here, Node is completely restricted to the merkle_tree module using pub(super), making it very explicit that the only consumer is merkle_tree, and that there are no consumers external to merkle_tree. This means that the API is strictly internal, and we can always analyze it from an internal perspective.

xgreenx · 2024-03-11T07:32:05Z

fuel-merkle/src/sparse/merkle_tree.rs

+            // divergence equates to this hash.
+            //
+            let leaf = actual_leaf.clone().into();
+            let exclusion_proof = ExclusionProof { proof_set, leaf };


I like your idea of returning the key and value in the case of the leaf. You can return an enum, which is either a placeholder or (key, value). And in the proof function, we can always create a leaf node.

xgreenx · 2024-03-11T07:32:23Z

fuel-merkle/src/sparse/proof.rs

+}
+
+#[derive(Clone, Eq, PartialEq)]
+pub struct ExclusionLeaf {


It should be an enum

xgreenx · 2024-03-11T07:33:12Z

fuel-merkle/src/sparse/proof.rs

+    }
+
+    fn is_placeholder(&self) -> bool {
+        self.leaf_value() == zero_sum()


We discussed it before, user may use any value, even zero sum. We need to use a separate variant for the case of the placeholder

Yes, but in this context, we are looking at the "exclusion leaf" (the nearest included leaf to the excluded key). The value of the exclusion leaf is either the hash of the user data for this leaf (not the user data itself), or placeholder hash. This is how this version allows us to use the zero sum for user data.

But, either way, we can use an enum here to simplify this.

It is true that, at this point, we have a hash of the user's value. But if the hash is zero, we will treat it as a placeholder, while it is not true. Enum fixes that case.

xgreenx · 2024-03-11T07:36:09Z

fuel-merkle/src/sparse/proof.rs

+        if !leaf.is_placeholder() && *leaf.leaf_key() == key.as_ref() {
+            return false;
+        }
+        let mut current = leaf.hash();


We want to process enum here. If the leaf is a (key, value), then we want to use Node::create_leaf(key, value). Otherwise we want to use Node::create_placeholder().

After you can call node.hash().

* Refactor * Refactor * Minor * Fix import * Revert file move * Fixes + comments * Use core deref instead of std * Group public and private functions

bvrooman · 2024-03-11T22:43:56Z

fuel-merkle/src/sparse/proof.rs

@@ -58,9 +58,15 @@ pub struct InclusionProof {
 impl InclusionProof {
    pub fn verify(&self, root: &Bytes32, key: &MerkleTreeKey, value: &[u8]) -> bool {
        let Self { proof_set } = self;
+
+        if proof_set.len() > u32::MAX as usize {


Good catch! We can actually make this

if proof_set.len() > 256 {

since a valid proof set cannot have more than log(n) side nodes.

bvrooman added 5 commits December 18, 2023 13:53

SMT generate proof

fda3e0a

Passing test for proof

da38555

Update merkle_tree.rs

d8dd367

Update tests

c4efe4f

WIP

5793c2b

xgreenx linked an issue Dec 20, 2023 that may be closed by this pull request

Support SMT proofs generation #639

Closed

bvrooman added 18 commits December 21, 2023 23:42

Support inclusion and exclusion proofs

71f503f

Update CHANGELOG.md

c2e2f16

Merge branch 'master' into bvrooman/feat/smt-proofs

7146ee0

Clean up

cbd627d

Clippy

eeb8bd8

Additional tests

30df56f

Remove superfluous code

093dc4c

WIP

48f40a8

debugging

39c2e90

Fix proofs

81a9f64

Clean up

317f342

Clean up

1094d61

Clippy

3483e75

use alloc

fc08309

use alloc format

0c50d9c

Clippy

00e021e

Clean up

4d895d2

cosmetic

2cf6469

bvrooman changed the title ~~Bvrooman/feat/smt proofs~~ feat: SMT Proofs (Inclusion and Exclusion) Dec 24, 2023

bvrooman self-assigned this Dec 24, 2023

bvrooman marked this pull request as ready for review December 24, 2023 00:50

bvrooman requested a review from a team December 24, 2023 00:50

Add test for empty tree

90dbe51

MitchTurner reviewed Jan 15, 2024

View reviewed changes

bvrooman and others added 2 commits March 7, 2024 15:57

Clippy

b519be5

Another test case

0977a79

xgreenx reviewed Mar 7, 2024

View reviewed changes

bvrooman added 9 commits March 8, 2024 14:03

use debug_assert over assert

ca6edbf

Update proof.rs

a2ee4d8

Remove invalid test

c1ad59b

Fix tests

8251b01

Merge branch 'master' into bvrooman/feat/smt-proofs

3aafeb5

# Conflicts: # CHANGELOG.md

ExclusionLeaf

2506510

Clippy

faafe04

Style

d67770a

Remove dbg

9558729

xgreenx reviewed Mar 11, 2024

View reviewed changes

bvrooman and others added 3 commits March 11, 2024 12:06

Use enum for ExclusionLeaf

e961cd8

dev: refactor smt node visibility (#696)

124f183

* Refactor * Refactor * Minor * Fix import * Revert file move * Fixes + comments * Use core deref instead of std * Group public and private functions

Small nits from me=)

fa5d09c

xgreenx previously approved these changes Mar 11, 2024

View reviewed changes

bvrooman commented Mar 11, 2024

View reviewed changes

Simplify proof check

b446631

bvrooman dismissed xgreenx’s stale review via b446631 March 11, 2024 23:51

bvrooman enabled auto-merge March 12, 2024 01:37

bvrooman added this pull request to the merge queue Mar 12, 2024

xgreenx approved these changes Mar 12, 2024

View reviewed changes

Merged via the queue into master with commit 60e7483 Mar 12, 2024
38 checks passed

bvrooman deleted the bvrooman/feat/smt-proofs branch March 12, 2024 06:52

xgreenx mentioned this pull request Mar 28, 2024

Release v0.48.0 #711

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: SMT Proofs (Inclusion and Exclusion) #648

feat: SMT Proofs (Inclusion and Exclusion) #648

bvrooman commented Dec 20, 2023 •

edited

Loading

MitchTurner Jan 15, 2024

xgreenx left a comment

xgreenx Mar 7, 2024

xgreenx Mar 7, 2024

bvrooman commented Mar 7, 2024 •

edited

Loading

bvrooman commented Mar 11, 2024

bvrooman commented Mar 11, 2024

xgreenx Mar 11, 2024

bvrooman Mar 11, 2024

xgreenx Mar 11, 2024

bvrooman Mar 11, 2024 •

edited

Loading

bvrooman Mar 11, 2024 •

edited

Loading

xgreenx Mar 11, 2024

xgreenx Mar 11, 2024

xgreenx Mar 11, 2024

bvrooman Mar 11, 2024 •

edited

Loading

xgreenx Mar 11, 2024

xgreenx Mar 11, 2024

bvrooman Mar 11, 2024

feat: SMT Proofs (Inclusion and Exclusion) #648

feat: SMT Proofs (Inclusion and Exclusion) #648

Conversation

bvrooman commented Dec 20, 2023 • edited Loading

Example

Example Use Case

Choose a reason for hiding this comment

xgreenx left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bvrooman commented Mar 7, 2024 • edited Loading

bvrooman commented Mar 11, 2024

bvrooman commented Mar 11, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bvrooman Mar 11, 2024 • edited Loading

Choose a reason for hiding this comment

bvrooman Mar 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bvrooman Mar 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bvrooman commented Dec 20, 2023 •

edited

Loading

bvrooman commented Mar 7, 2024 •

edited

Loading

bvrooman Mar 11, 2024 •

edited

Loading

bvrooman Mar 11, 2024 •

edited

Loading

bvrooman Mar 11, 2024 •

edited

Loading