Distributed work fixes #2230

guilhermelawless · 2019-08-20T13:03:48Z

Distributed work had several issues:

Work cancels were not being sent at all (bad file descriptor), because the socket was not given the required host and port.
Local work generation was never used if any work peer was set, which is a problem when the work peers are unresponsive. Now, if all work peers are unresponsive, the node starts local generation immediately for all subsequent requests, until one work peer responds correctly.
Work requests were hanging and not being handled correctly when the distributed work object was destroyed.
(general) The work pool was not being stopped by node::stop() until work finished.

These are all fixed. The node also now logs at startup if there is no local work generation set and no work peers are defined.

nano/node/node.cpp

SergiySW

I don't like generating work locally if the node has remote peers, but this seems good.

This change improves on the previous behavior. Local work generation is only used after all peers are unresponsive. A flag is set in the node (unresponsive_work_peers) so that local generation starts immediately for the next distributed work request. Work peer requests are still sent, and as soon as one replies with valid work, local generation is delayed again until all peers are unresponsive, and so on. This is a fallback mechanism for when all peers are failing, since the previous behavior would always wait for peer timeouts (which can be long, 2 minutes here).

The only case not handled, for simplicity, is when multiple work requests are queued and the first one has unresponsive peers. In this case, the currently queued requests will not start local generation immediately; only the next queued distributed work will.
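The fallback described above could be modeled roughly as follows. This is only a minimal sketch, not the code in the PR: `node_sketch`, `distributed_work_sketch` and their member functions are hypothetical names made up for illustration, and only the `unresponsive_work_peers` flag name comes from the comment above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the node; only the flag name is taken from the discussion.
struct node_sketch
{
    std::atomic<bool> unresponsive_work_peers{ false };
};

// Hypothetical model of a single distributed work request.
class distributed_work_sketch
{
public:
    distributed_work_sketch (node_sketch & node_a, std::size_t peer_count_a) :
    node (node_a),
    outstanding (peer_count_a)
    {
        // With no peers, or if every peer failed on the previous request,
        // start local generation right away instead of waiting for timeouts.
        local_started = (peer_count_a == 0) || node.unresponsive_work_peers.load ();
    }
    // A peer replied with valid work: later requests delay local generation again.
    void peer_success ()
    {
        done = true;
        node.unresponsive_work_peers = false;
    }
    // A peer timed out or failed.
    void peer_failure ()
    {
        assert (outstanding > 0);
        --outstanding;
        if (outstanding == 0 && !done)
        {
            // All peers are unresponsive: remember it and fall back to local work.
            node.unresponsive_work_peers = true;
            local_started = true;
        }
    }
    bool local_generation_started () const
    {
        return local_started;
    }

private:
    node_sketch & node;
    std::size_t outstanding;
    bool local_started{ false };
    bool done{ false };
};
```

In this model, a request with two failing peers only falls back to local generation after the second failure, while the next request starts local generation immediately. Once any peer succeeds, later requests wait for peers again.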

guilhermelawless · 2019-08-23T07:43:33Z

@SergiySW please see the last commit, 0c5b8e9, and its commit message. Local generation is now only started as a last resort and, for future requests, only as long as peers are unresponsive.

guilhermelawless added 4 commits August 20, 2019 12:18

Fix work_cancel not being sent and some unreachable peers hanging the requests

6f17391

Start local work generation along with work peer requests

bdcc24d

Work cancel done in a new socket

2dc73b4

Log if work generation cannot be performed

8e289ca

guilhermelawless added the bug and quality improvements labels Aug 20, 2019

guilhermelawless added this to the V20.0 milestone Aug 20, 2019

guilhermelawless requested review from clemahieu and cryptocode August 20, 2019 13:03

guilhermelawless self-assigned this Aug 20, 2019

guilhermelawless added 5 commits August 20, 2019 14:06

Making sure stop is only called once

dabbe5c

Simplifying connection handling

4efe2eb

Fix bad address from unordered map

6658997

Stop on destructor

0d0b3f5

Correctly stopping work generation on node stop

453fb89

cryptocode reviewed Aug 20, 2019

nano/node/node.cpp (outdated, resolved)

nano/node/node.cpp (outdated, resolved)

guilhermelawless added 2 commits August 20, 2019 19:04

Use atomic<bool> and exchange

fb06b76

node.work is not stopped on node.stop due to testing setup

e97006e
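The commits "Making sure stop is only called once", "Stop on destructor" and "Use atomic<bool> and exchange" suggest a pattern along the following lines. This is a sketch under assumptions rather than the PR's actual code: `stoppable_sketch` and `cleanup_runs` are hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object whose stop () may be called from several places,
// including its own destructor.
class stoppable_sketch
{
public:
    // Returns true only for the single call that actually performed the stop.
    bool stop ()
    {
        // exchange atomically sets the flag and returns the previous value,
        // so exactly one caller ever observes false here.
        if (stopped.exchange (true))
        {
            return false;
        }
        ++cleanup_runs;
        assert (cleanup_runs == 1); // Cleanup must never run twice
        return true;
    }
    ~stoppable_sketch ()
    {
        // Safe even if stop () was already called explicitly.
        stop ();
    }
    int cleanup_runs{ 0 };

private:
    std::atomic<bool> stopped{ false };
};
```

Compared with a separate load followed by a store, a single `exchange` leaves no window in which two threads can both see the flag as unset.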

cryptocode approved these changes Aug 20, 2019

guilhermelawless requested a review from SergiySW August 21, 2019 22:38

SergiySW approved these changes Aug 22, 2019

guilhermelawless requested a review from SergiySW August 23, 2019 07:44

guilhermelawless added 4 commits August 23, 2019 10:14

Also start local generation immediately if there are no work peers

dcdfa2e

Robustify

0985b1c

Unbreak tests

b4d04e7

No need to wrap in node background

56665bd

SergiySW approved these changes Aug 23, 2019

Merge branch 'master' into distibuted-work-fixes

77b5689

guilhermelawless removed the request for review from clemahieu August 23, 2019 18:31

zhyatt merged commit ddf4d66 into nanocurrency:master Aug 23, 2019

guilhermelawless deleted the distibuted-work-fixes branch August 23, 2019 22:46
