Unify engines #3405
Conversation
The lifecycle of an avalanchego node can be separated into three main stages:

Stage I - If supported by the VM, a state sync occurs; otherwise, this stage is skipped. In the X-chain, this stage is entirely substituted by a custom bootstrapping phase that replicates and executes the X-chain DAG.
Stage II - A stage called bootstrapping, which entails replicating the blocks from other avalanchego nodes and executing them.
Stage III - The final form of an avalanchego node, in which it runs the snowman consensus protocol.

The stages are implemented by components called "engines":

- avalanche bootstrapping
- snowman
- state syncer
- snowman bootstrapper

The handler in snow/networking/handler/ is responsible for routing messages from the network to the correct engine.

Engines all implement the same common.Engine interface, but that interface consists of the union of all operations across all engines. Indeed, it is often the case that a message of type `m` dispatched by engine `e` cannot be processed by a different engine `e'`. For instance, a Chits message cannot be processed by any engine other than consensus, and a message about a query of a state summary is only relevant to the state sync engine. To that end, each engine simply implements a no-op dispatcher for the messages it does not care about.

The biggest problem with the existing structure is that the lifecycle of the engines imposes a strict one-way movement across the stages, and there is no single component that consolidates the transitions among them. Movement between the stages takes place via a callback given to each engine at every stage but the final one (Stage I and Stage II). The structure therefore makes it difficult to introduce movement from snowman consensus back to bootstrapping / state sync, or to gain better control over message dispatch.

This commit unifies all engines into a single one under snow/engine/unified. As a result, the implementation of the handler in snow/networking/handler/handler.go is now simpler, as it only interacts with the unified engine and never needs to query the snow.EngineState. The state transitions between the stages are now handled by the unified engine, and since the code that dispatches messages to the right engine now lives entirely in the unified engine, it is not only more testable but also lets us move among the stages in the same place where we consider the current stage in order to dispatch the message to the correct engine.

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
This doesn't sound like state syncing. Why is this performed in Stage I instead of Stage II?
Bootstrapper: bootstrapper,
Consensus:    engine,
},
ctx.State.Set(snow.EngineState{
(No action required) Maybe determine the appropriate state (snow.StateSyncing or snow.Bootstrapping) and then perform Set?
"context"

"github.com/ava-labs/avalanchego/api/health"
)

type BootstrapableEngine interface { |
(No action required) What differentiates a BootstrapableEngine from an AvalancheBootstrapableEngine? Maybe use a docstring to document?
Engine
AcceptedFrontierHandler
AcceptedHandler
AncestorsHandler |
(No action required) Would it make sense to embed the AvalancheBootstrapableEngine interface, since only the Accepted* embeddings are unique to this interface?
"github.com/ava-labs/avalanchego/trace"
)

type Enabler interface { |
(No action required) Enabler implies that it enables something, but the only method would appear to be a getter. Maybe EnabledChecker?
type OnFinishedFunc func(ctx context.Context, lastReqID uint32) error

type Factory interface { |
(No action required) Maybe move this to factory.go?
ctx.State.Set(testCase.state)

var vm enginetest.VM
configureVMRouting(&vm) |
Why does this need to be called if routing isn't being checked?
configureVMRouting(&vm)

engine, err := unified.EngineFromEngines(ctx, ef, &vm)
require.NoError(t, err)
(No action required) Maybe use require = require.New(t) for a given scope rather than passing t to every assertion?
return invokedTable
}

func createEngineFactory(t *testing.T, gs enginetest.Engine, as enginetest.Engine, sm enginetest.Engine, stateSyncer mockStateSyncer, avalancheSyncer mockStateSyncer, bootstrapper mockStateSyncer) *mock.Factory { |
(No action required) Maybe refactor the code under test to enable validation without the use of a mock?
}

func createEngineInvocationMap(t *testing.T) map[string]func(*unified.Engine) {
m := make(map[string]func(*unified.Engine))
(No action required) Maybe use map[string]func(*unified.Engine) error instead? That would avoid a dependency on t (the error check could be performed by the test) and allow the map to be defined once rather than per sub-test.
smbootstrap "github.com/ava-labs/avalanchego/snow/engine/snowman/bootstrap"
)

func TestFactory(t *testing.T) { |
(No action required) Maybe skip checking trivial getters, and focus on specific methods whose correctness is difficult to reason about by observation?
How this works
Just a refactoring: I shrank the interfaces of the engines to what is absolutely required, and now only the unified engine implements common.Engine. Removed unnecessary code from the handler package and moved the logic into the unified engine.
How this was tested
CI.