Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix walk with partial traverse #111

Merged
merged 8 commits into from
Jun 19, 2020

Conversation

carver
Copy link
Contributor

@carver carver commented Jun 11, 2020

What was wrong?

When traversing to a node, we occasionally get TraversedPartialPath exceptions (if the trie changed in the middle of walking it). In that case, we want some help from the exception to continue our walk. Previously, we would build the sub-segments manually, but that was error-prone, and turns out that it got broken in a refactor and we didn't notice!

How was it fixed?

Fix the broken TPP handling in the walk tests. Add a new simulated_node which helps simplify the handling.

Also, add a new hypothesis strategy that is more likely to catch issues like this, by focusing on building some long extension nodes. (which is not super likely to happen in a smooth uniform random distribution of keys)

PR is nearly ready, just waiting to merge #113 and #114 and rebase.

TODO:

  • changelog
  • readme update
  • squashing
  • squashing (after review)

Cute Animal Picture

Cute animal picture

@carver carver mentioned this pull request Jun 11, 2020
2 tasks
@carver carver force-pushed the fix-walk-with-partial-traverse branch 2 times, most recently from a6da6b9 to b2283ed Compare June 13, 2020 00:28
@carver

This comment has been minimized.

@carver carver force-pushed the fix-walk-with-partial-traverse branch 2 times, most recently from 3a509e2 to b090145 Compare June 18, 2020 18:48
@carver carver marked this pull request as ready for review June 18, 2020 18:50
@carver carver force-pushed the fix-walk-with-partial-traverse branch from b090145 to 4410e38 Compare June 18, 2020 21:26
We can now reproduce better examples and more reliably, and have better
explanations for what functionality they are testing.
It's important to prune while walking because we want to run mid-walk
trie updates. Those updates should delete access to old nodes, to
simulate losing access to old trie nodes from peers that have moved on.
This makes sure we don't request trie nodes that don't exist anymore
during backfill-only type scenarios.

Depends on ethereum#113
@carver carver force-pushed the fix-walk-with-partial-traverse branch 3 times, most recently from f767531 to 0cf0490 Compare June 18, 2020 22:28
exc.nibbles_traversed + next_segment
for next_segment in exc.node.sub_segments
]
sub_segments = exc.simulated_node.sub_segments
Copy link
Contributor Author

@carver carver Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire PR is essentially motivated by wanting to enable this change. Not just in a couple tests, but as the "recommended" approach for handling TraversedPartialPath while walking a trie. (ie~ the readme changes)

The old approach was a bit cumbersome, but it was also broken. We didn't even notice because the code path wasn't touched! So most of the testing changes are about increasing the chance of catching this bug and other similar bugs. (Which did expose other issues in PRs that are now merged)

@carver carver requested a review from cburgdorf June 18, 2020 22:34
@carver
Copy link
Contributor Author

carver commented Jun 18, 2020

If the PR feels a little overwhelming, each commit in order might be easier to review.

CHANGELOG Outdated
@@ -12,6 +12,9 @@ Features
- Can now use NodeIterator to navigate to the empty key b'', using NodeIterator.next(key=None) or
simply NodeIterator.next().
https://github.com/ethereum/py-trie/pull/110
- TraversedPartialPath has a new simulated_node attribute, which you can treat as a node that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: which you we can

README.md Outdated
# We can now continue exploring the children of the extension node, by using an attribute on the exception:
>>> sub_segments = last_exception.simulated_node.sub_segments
((0x6, 0x5, 0x7, 0x9),)
# Note that this sub-segment now carries you the rest of the way to the child
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice the file has mixed narrative perspective anyway but since most of it seems to be following the "we" form I'm going to point out this last grammar nit before I try to understand what's actually going on here haha.

Note that this sub-segment now carries you us the rest of the way to the child of the node that you we only partially traversed into.

Copy link
Contributor

@cburgdorf cburgdorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how much effort you put into documenting this 👍 It's still hard to digest but that's just on my part because I haven't had to deal with the internals of tries yet (Though your walk-throughs have definitely helped to build up some knowledge)

I left a few comments inline. But nothing too important (still curious about the local import though)

]
# We can now continue exploring the children of the extension node, by using an attribute on the exception:
>>> sub_segments = last_exception.simulated_node.sub_segments
((0x6, 0x5, 0x7, 0x9),)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍

),
)
@pytest.mark.parametrize('key_encoding', (compute_extension_key, compute_leaf_key))
def test_valid_TraversedPartialPath_untraversed_nibbles(valid_nibbles, key_encoding):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

odd casing for the method name but I know it's on purpose and if you happen to like that sort of thing I won't argue against it :)

assert simulated_node.suffix == ()
assert simulated_node.raw[0] == compute_extension_key(remaining_key)
else:
raise NotImplementedError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a bit odd to raise a NotImplementedError from a test case. Is this supposed to be a placeholder for covering additional test cases in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess we usually say something like raise Exception("Unreachable"). The line of thinking was that the test needs to implement how to handle the different ways of encoding a key. But I'll change it to stick to the more team-standard way of doing it. 👍

return self._simulated_node

def _make_simulated_node(self) -> HexaryTrieNode:
from trie.utils.nodes import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason for the local import?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the exceptions module usually shouldn't import from the rest of the project, because it often will cause an import cycle. I think it actually does cycle here if I try to import at the top. Even if it doesn't though, I don't want to leave a landmine for the next contributor.

This is useful when traversing, to treat the trie "as if" there had been
a node at the place you traversed to.

For example, say you traverse to path (0xf, 0) into the root node, which
is an extension node. It has a key extension of (0xf, 0, 0xa, 9), so a
TraversedPartialPath is raised. The simulated node will be an extension
with key (0xa, 9), with the same child node.

Also adds the changelog entry, and updates the readme. (The readme had
the bug in plain sight all along! Can you spot it? Look for the change
in the fog prefixes that happened after using the new simulated_node
API).
This fixes test_trie_walk_root_change_with_traverse(), which incorrectly
handled sub_segments during a TraversedPartialPath exception.

Tweak hypothesis to find the scenario: sometimes it helps to have nodes
stored to the database, instead of embedded. We can now force nodes to
be encoded, by increasing the minimum_value_length.

Now that we can reproduce the problem, fix the test. The chosen solution
is to use the new TraversedPartialPath.simulated node, and use the
sub_segments from that simulated node. This requires very little work
from the caller.
This bug is already fixed in:
ethereum#109

But it's worth having the example here anyway, as a backup.
Recently, there was a fix for incorrectly raising a TraversedPartialPath
when traversing into a leaf, even if the suffix of the leaf did not
match the traversal path. See:
ethereum#114

This bug was actually found during a hypothesis run during a trie walk.
Although there are new unit tests making sure the leaf traversal works
properly, it seems worth adding the example explicitly as a kind of
backup test that the scenario works properly during a trie walk &
backfill.
This is a cleanup for the backfill tests, using the shared strategy to
try to expand the surface area of the hypothesis tests.

Note:

- The trie_from_keys() builder with a minimum_value_length=0 would try
to "insert" at key b'' the value b'', which is of course a delete.  So
the smallest permissable size must be a byte length of 1.

- An empty trie is not handled well by these backfill tests, but
backfill is entirely uninteresting for an empty trie, so we can ignore
the case.
- find and fix bug with handling sub-segments during
TraversedPartialPath exception, while walking over the trie, using
cached nodes, with traverse_from() in the test
test_trie_walk_root_change_with_cached_traverse_from()
- add example() to make sure that case ^ is tested in both tests about
walking over a changed trie.
- modify hypothesis inputs a bit to encourage hypothesis to find
interesting scenarios like the above example. The new approach
did find a relevant example once or twice, but needed low-thousands of
examples.
- Test the traverse_from() scenario with both approaches: emptying the
cache and keeping it. Handle both scenarios properly.
@carver carver force-pushed the fix-walk-with-partial-traverse branch from 0cf0490 to 3ff167d Compare June 19, 2020 16:22
@carver carver merged commit 552a834 into ethereum:master Jun 19, 2020
@carver carver deleted the fix-walk-with-partial-traverse branch June 19, 2020 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants