Skip to content
This repository has been archived by the owner on Jul 1, 2021. It is now read-only.

Free the fork choice #666

Merged
merged 15 commits into from
Jun 3, 2019
Merged

Conversation

ralexstokes
Copy link
Member

@ralexstokes ralexstokes commented May 24, 2019

EDIT: update description to reflect current state of PR, may be helpful to start w/ Summary of changes below

What was wrong?

Big part of implementing LMG GHOST fork choice.

Pulling this out of #532 as it is substantial enough to stand on its own.

In Eth1, the fork scoring function is just a function of the block header. This fact makes it easy to calculate the score as we import new headers, adjusting the canonical head along the way.

In Eth2, the fork scoring function is more complex (a function of the block, the state, and off-chain attestations); moreover, we want to run the fork choice in more situations than just importing a new valid block.

The current design of the fork choice calculation does not readily facilitate these tasks.

How was it fixed?

Summary of changes:

  • Extend API of persist_block to allow arbitrary scoring
  • Add set_score API (not technically required for this PR but will need for future fork choice work. easier to include here than patch up the git history... forgive me for the lack of precision :) )
  • Update the tests!

To address these limitations, we allow the caller of persist_block to provide their own scoring logic.
The DB class now exposes methods to persist the fork choice but does not actively compute it (at least via the new API persist_block_without_scoring). The particular fork choice lives to a particular variation of the StateMachine. For example, the SerenityStateMachine will have the lmd_ghost_fork_scoring rule after it lands in a subsequent PR. For now, they just use the existing higher_slot_scoring rule.

Rationale for putting the fork choice scoring in the StateMachine and using it in the Chain class

The Chain class is what ultimately orchestrates validating incoming data, persisting valid data, running the fork choice (based on the current state machine's scoring rule) and persisting the results of this fork choice (i.e. writing the new block's score and updating the canonical head if necessary).

This configuration is more flexible, exposing a direct path to inject the AttestationPool necessary for LMD GHOST, and avoids the situation where the DB knows about every other object in the system (single responsibility and all that).

I have some concerns mainly around the new API(s). It might be dangerous to expose something like set_score for anyone to call and update_canonical_head_if_needed feels a little clunky still... I would suggest we merge this in and as we gain confidence w/ the new stuff, we can look at deprecating the old stuff (namely, persist_block and its callers).

Other comments

We could make the argument to pull the actual process of selecting which fork to follow (given a "score chain") but this seems low value given that we can map any fork choice to "follow the monotonically increasing score" by warping the fork choice scoring function).

There is another concern around "overloading" the state machine w/ the fork choice -- it seems like the best place to put it for now but readers should note that the state machine is no longer just a transformation from block to block. To justify this change, let's answer the question: if we change the fork choice (at some slot), does that imply a new state machine?

To-Do

I'll add more tests but given there are some design questions up in the air, I'll get this up so you can make a pass at reviewing it if you want...

  • Add more tests to cover new APIs

Cute Animal Picture

put a cute animal picture link inside the parentheses

@ralexstokes
Copy link
Member Author

ralexstokes commented May 28, 2019

@hwwhww given that you mentioned you would make a pass at this -- another option here is to keep
persist_block and just add another parameter which is the fork choice.

EDIT: scratch this -- we still have the issue of tying the fork choice to the block import w/ this route... readers should ignore this comment, the problem that was here has been resolved in this PR :)

so persist_block(block, block_class) becomes persist_block(block, block_class, fork_choice_scoring)

as i'm writing these tests, i'm having trouble thinking of cases where you would really want to persist a block w/o scoring it and finding the new head... the goal w/ this entire stream of work is so that we can execute arbitrary logic when running the fork choice.

one way to do this is how this PR currently is (expose the pieces and let the client of this code 'deal w/ it'). the other way i'm alluding to in this comment is that we just pass in this code as the fork_choice_scoring.

e..g for LMD GHOST, something like:

def lmd_ghost_fork_choice_scoring(attestation_pool):
    return lambda block, chaindb: run_lmd_ghost(block, chaindb, attestation_pool)


# something using a `ChainDB`:
scoring = lmd_ghost_fork_choice_scoring(get_attestation_pool())
new_blocks, old_blocks = chaindb.persist_block(block, block.__class__, scoring)

you have a preference for one way or the other? i think this latest way avoids some of the security/usability concerns of the way that this PR currently proposes...

@ralexstokes ralexstokes marked this pull request as ready for review May 28, 2019 23:31
return chaindb


@pytest.fixture(params=[0, 10, 999])
@pytest.fixture(params=[1, 10, 999])
def block(request, sample_beacon_block_params):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to not conflict w/ genesis block

@ralexstokes
Copy link
Member Author

ok! this is a little messy as it tracked my thinking about the best way to get towards a more flexible fork choice :)

after I rebase, it should be ready to merge pending review. the tests all pass locally but circleCI is down at the moment.

Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't claim this to be a thorough review, but at a conceptual level, this seems very appropriate. I think cleaning up the eth1 model to use this pattern which I think qualifies as composition indeed does make sense.

@ralexstokes ralexstokes force-pushed the free-the-fork-choice branch 6 times, most recently from 7e09ea0 to e0b8f3b Compare May 29, 2019 22:18
@ralexstokes
Copy link
Member Author

just to update, this is officially ready for review now :)


# methods
@staticmethod
def create_block_from_parent(parent_block: BaseBeaconBlock,
block_params: FromBlockParams) -> BaseBeaconBlock:
return create_serenity_block_from_parent(parent_block, block_params)

def get_fork_choice_scoring(self) -> ForkChoiceScoring:
return higher_slot_scoring
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the lmd_ghost looks like in this method?
Does the method produce new functions for every slot and block?

My imagination:

def get_fork_choice_scoring(self) -> ForkChoiceScoring:
    def scoring(block):
       return lmd_ghost_scoring(db=self.chaindb.db, attestation_pool=somewhere.attestation_pool, state=self.state, start_block=somewhere.start_block, target_block=block, config=self.config)
    return scoring

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah more or less! modulo those parameters changing, e.g. i don't think the StateMachine has a reference directly to the chain_db and similar for the other dependencies...

i'll think a bit about the best way to keep a unified function interface while still giving us flexibility...

there is also a concern about how often this closure is created -- it would be possible to memoize this function, esp if the attestation pool is a reference to the mutable thing, not just a description of attestations -- this route generally goes against our preference for immutability though so i'd be inclined to find something more, well, immutable :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChihChengLiang latest thinking here:

08523d3

ec29a72

like i say in the PR, i'm still chewing on a clean way to further modularize the fork choice so it is not so tightly bound to a particular StateMachine.

@ralexstokes ralexstokes mentioned this pull request May 31, 2019
3 tasks
Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review!

Rationale for putting the fork choice scoring in the StateMachine and using it in the Chain class

if we change the fork choice (at some slot), does that imply a new state machine?

My two cents:

  • Agreed that it’s not ideal to embed the fork choice scoring in BeaconChain!
  • However, regarding the efficiency, at this point I’m not sure if the fork choice scoring upgrade would be necessary for the future. (Comparing to eth1 fork choice rule didn’t change much).
    • If we are certain that the fork choice rule upgrade is exactly as the StateMachine (forks) upgrade boundary, we can make sure that in each batch is using the same fork choice rule scoring function (based on the TBD wire protocol), so that we may be able to initialize less state_machines during syncing.
  • state_machine.get_fork_choice_scoring() function refactoring: instead of returning a Callable, it seems similar to how eth1 eth/vm/forks/ overrides the static methods. For example, the compute_difficulty in Frontier and in HomesteadVM. I think we can utilize this pattern in SerenityStateMachine. Defer to the snake charmers on which pattern is more appropriate in this case. :)


from eth2.beacon.types.blocks import BaseBeaconBlock

ForkChoiceScoring = Callable[[BaseBeaconBlock], int]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about naming it ForkChoiceScoringFn?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah appending Fn seems ok

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thinking was that earlier in this process I wasn't sure this abstraction would just be a Callable.... the ForkChoice could evolve so that it is more convenient to have more state, in which case we could just go ahead w/ a class and ForkChoiceScoring would become an abstract class (for the interface)...

but i think we can get away w/ just a Callable so let's do that for now

tests/core/p2p-proto/bcc/helpers.py Show resolved Hide resolved
@ralexstokes
Copy link
Member Author

ralexstokes commented Jun 1, 2019

* `state_machine.get_fork_choice_scoring()` function refactoring: instead of returning a `Callable`, it seems similar to how eth1 `eth/vm/forks/` overrides the static methods. For example, the `compute_difficulty` [in `Frontier`](https://github.com/ethereum/py-evm/blob/915cec2a475176ad9722869e845dca8bac7a66d8/eth/vm/forks/frontier/__init__.py#L75) and [in `HomesteadVM`](https://github.com/ethereum/py-evm/blob/915cec2a475176ad9722869e845dca8bac7a66d8/eth/vm/forks/homestead/__init__.py#L39). I think we can utilize this pattern in `SerenityStateMachine`. Defer to the snake charmers on which pattern is more appropriate in this case. :)

@hwwhww yeah i went down this route before I landed on the current thing but decided to do this mainly because I ran into issues w/ mypy. and so I just looked at the links you added for the eth1 stuff and guess what... they just # type: ignore.

my thinking was to just do the simple thing to keep moving :)

the one caveat is that w/ LMD GHOST there is basically a "setup" step where we need to fetch the latest justified state and it seems much more ergonomic as an instance method... i'll refer you here: https://github.com/ethereum/trinity/pull/685/files#diff-26594f899e3bcc74e868bbab3824d8f9R46

@ralexstokes
Copy link
Member Author

unless anyone has further comment or concern, i'll go ahead and merge this soon... i'll address #666 (comment) in #685 which builds off of this PR

@hwwhww
Copy link
Contributor

hwwhww commented Jun 1, 2019

@ralexstokes

issues w/ mypy

Right, I recalled it was really weird: ethereum/py-evm#1362 (comment) :'(

i'll go ahead and merge this soon... i'll address #666 (comment) in #685 which builds off of this PR

Sounds good to me!

@ralexstokes ralexstokes merged commit 9fe8fa8 into ethereum:master Jun 3, 2019
@ralexstokes ralexstokes deleted the free-the-fork-choice branch June 3, 2019 21:40
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants