
feat(dot/sync): improve worker pool #4258

Draft · wants to merge 1 commit into development from haiko/sync-worker-pool

Conversation

@haikoschol (Contributor) commented on Oct 15, 2024

The main difference in the worker pool API is that SubmitBatch() does not block until the whole batch has been processed. Instead, it returns an ID which can be used to retrieve the current state of the batch. In addition, Results() returns a channel over which task results are sent as they become available.

The main improvement this brings is increased concurrency, since results can be processed before the whole batch has been completed.

What has not changed is the overall flow of the Strategy interface: getting a new batch of tasks with NextActions() and processing the results with Process().
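
To make that flow concrete, here is a minimal sketch of the consumer side. Only SubmitBatch() and Results() are named in this description; BatchID, TaskResult, BatchStatus, and the exact signatures are assumptions for illustration.

```go
type (
	TaskID  string
	BatchID string
)

type Task interface{ ID() TaskID }

type TaskResult struct {
	Task  Task
	Error error
}

type BatchStatus struct{ Completed bool }

type WorkerPool interface {
	// SubmitBatch enqueues all tasks and returns immediately with an ID
	// that can be used to retrieve the state of the batch later.
	SubmitBatch(tasks []Task) BatchID
	// Results streams task results as they become available.
	Results() <-chan TaskResult
	// BatchStatus reports the current state of a submitted batch.
	BatchStatus(id BatchID) BatchStatus
}

// Results can be processed while the batch is still in flight:
func consume(pool WorkerPool, tasks []Task, process func(TaskResult)) {
	id := pool.SubmitBatch(tasks)
	for result := range pool.Results() {
		process(result) // e.g. the strategy's Process()
		if pool.BatchStatus(id).Completed {
			break
		}
	}
}
```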

Changes

  • replaced the code in dot/sync/worker_pool.go
  • adapted SyncService to the API changes of the new worker pool
  • adjusted test expectations for how often some mocks are called (hopefully without changing the logic under test)

Tests

```
go test github.com/ChainSafe/gossamer/dot/sync
```

Issues

Closes #4232

@CLAassistant commented on Oct 15, 2024

CLA assistant check
All committers have signed the CLA.

@haikoschol (Contributor, Author) commented on Oct 15, 2024

Created as a draft for two reasons:

  1. I'd like to run a sync from scratch on Westend and/or Paseo as a regression test for a while.
  2. To discuss and possibly address this TODO.

@haikoschol force-pushed the haiko/sync-worker-pool branch 4 times, most recently from 987cb6f to 4c0d5cb on October 17, 2024 at 08:07

A Contributor commented on the following lines:

```go
workerPool: NewWorkerPool(WorkerPoolConfig{
	MaxRetries: 5,
	// TODO: This should depend on the actual configuration of the currently used sync strategy.
	Capacity: defaultNumOfTasks * 10,
```

Why times 10?

A Contributor commented on the following lines:

```go
	ShowMetrics()
	IsSynced() bool
}

type syncTask struct {
```

Could you please move this to fullsync.go, since it is specific to that strategy?

A Contributor commented on the following lines:

```go
}

func (s *syncTask) ID() TaskID {
	return TaskID(s.request.String())
```

I'm not sure this is a good ID, since it is the string representation of the request: if we send the same request to multiple peers, we are going to get the same ID here. What if we just generate a UUID? Also, we could add a String() method for the string representation, which could be useful for debugging purposes.
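
A sketch of that suggestion, assuming github.com/google/uuid; the exact field layout of syncTask and the newSyncTask constructor are illustrative, not code from this PR:

```go
import "github.com/google/uuid"

type syncTask struct {
	id      TaskID
	request messages.P2PMessage
	// other fields elided
}

func newSyncTask(request messages.P2PMessage) *syncTask {
	return &syncTask{
		// unique even when the same request is sent to multiple peers
		id:      TaskID(uuid.NewString()),
		request: request,
	}
}

func (s *syncTask) ID() TaskID { return s.id }

// String keeps the request's string form available for debugging.
func (s *syncTask) String() string { return s.request.String() }
```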

A Contributor commented on the following lines:

```diff
@@ -119,6 +135,11 @@ func NewSyncService(cfgs ...ServiceConfig) *SyncService {
 		waitPeersDuration:     waitPeersDefaultTimeout,
 		stopCh:                make(chan struct{}),
 		seenBlockSyncRequests: lrucache.NewLRUCache[common.Hash, uint](100),
+		workerPool: NewWorkerPool(WorkerPoolConfig{
+			MaxRetries: 5,
```

Could you move the magic numbers to a const? 😄
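
One way to do that, as a sketch; the constant names are illustrative, and the values come from the diff above:

```go
const (
	defaultMaxRetries         = 5
	defaultCapacityMultiplier = 10
)

// at the construction site:
workerPool: NewWorkerPool(WorkerPoolConfig{
	MaxRetries: defaultMaxRetries,
	Capacity:   defaultNumOfTasks * defaultCapacityMultiplier,
}),
```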

A Contributor commented on the following lines:

```go
},
})
}
task, ok := result.Task.(*syncTask)
```

What if we define result.Task over a generic, so we can skip this cast?
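
A sketch of that generics idea (an assumption, not code from this PR): parameterizing the result and pool over the task type makes result.Task the concrete type, so the assertion disappears.

```go
type TaskResult[T any] struct {
	Task  T
	Error error
}

type WorkerPool[T any] interface {
	SubmitBatch(tasks []T) BatchID
	Results() <-chan TaskResult[T]
}

// With WorkerPool[*syncTask], result.Task is a *syncTask directly:
func drain(pool WorkerPool[*syncTask]) {
	for result := range pool.Results() {
		task := result.Task // no `result.Task.(*syncTask)` assertion needed
		_ = task
	}
}
```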


@dimartiro (Contributor) commented on Oct 21, 2024 on the following lines:

```go
	continue
}
request := task.request.(*messages.BlockRequestMessage)
```

You can skip this cast if you change syncTask.request from messages.P2PMessage to *messages.BlockRequestMessage, since (if I'm not wrong) this is the only type we are expecting.
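
A sketch of that change (the field layout is an assumption):

```go
type syncTask struct {
	request *messages.BlockRequestMessage // previously messages.P2PMessage
	// other fields elided
}

// The call site then reads the field directly, with no type assertion:
func handle(task *syncTask) {
	request := task.request
	_ = request
}
```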
