
Substrate drops gossamer nodes #2595

Closed
kishansagathiya opened this issue Jun 10, 2022 · 4 comments · Fixed by #2662

@kishansagathiya
Contributor

kishansagathiya commented Jun 10, 2022

Describe the bug

  • When we run a network containing a Substrate node and a Gossamer node, Substrate eventually drops the Gossamer node. It does not happen every time, but it happens rather frequently.

Possible Solution

It looks like we are sending a block that was not requested. I believe that is the most likely cause, because Substrate drops Gossamer right after this message:

2022-06-10 18:53:34.681 DEBUG tokio-runtime-worker sync: Received block that was not requested. Requested Hash(0xc22e325a230162748411c3aa7701a24d8e264a61a5662fc6b5c7e13034afe5bc), got Some(Header { parent_hash: 0xee87cc7ac5e87460094ec4ba29bbf8f28fb7741b69e7edb142233c3dd990b365, number: 1, state_root: 0x358d9a63263eeacf5a753316e2fa7e45405e5a152c32a2546961fdcb3c47d673, extrinsics_root: 0x97ed434301435ad255ef38db54a17a50b13f16985cdee32c49835a0faa405123, digest: Digest { logs: [DigestItem::PreRuntime([66, 65, 66, 69], [3, 0, 0, 0, 0, 116, 209, 168, 24, 0, 0, 0, 0, 48, 247, 238, 88, 232, 162, 220, 223, 135, 104, 230, 159, 199, 113, 199, 243, 60, 205, 76, 56, 4, 45, 39, 197, 106, 228, 110, 170, 128, 167, 42, 39, 235, 121, 30, 38, 236, 154, 127, 30, 100, 144, 31, 236, 246, 105, 47, 254, 120, 236, 178, 192, 70, 162, 110, 123, 117, 92, 237, 94, 82, 50, 179, 0, 61, 203, 4, 126, 246, 19, 158, 138, 198, 109, 196, 244, 84, 21, 213, 37, 43, 217, 125, 125, 78, 87, 197, 94, 87, 126, 4, 250, 86, 52, 13, 8]), DigestItem::Consensus([66, 65, 66, 69], [1, 12, 212, 53, 147, 199, 21, 253, 211, 28, 97, 20, 26, 189, 4, 169, 159, 214, 130, 44, 133, 88, 133, 76, 205, 227, 154, 86, 132, 231, 165, 109, 162, 125, 1, 0, 0, 0, 0, 0, 0, 0, 142, 175, 4, 21, 22, 135, 115, 99, 38, 201, 254, 161, 126, 37, 252, 82, 135, 97, 54, 147, 201, 18, 144, 156, 178, 38, 170, 71, 148, 242, 106, 72, 1, 0, 0, 0, 0, 0, 0, 0, 144, 181, 171, 32, 92, 105, 116, 201, 234, 132, 27, 230, 136, 134, 70, 51, 220, 156, 168, 163, 87, 132, 62, 234, 207, 35, 20, 100, 153, 101, 254, 34, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), DigestItem::Seal([66, 65, 66, 69], [132, 221, 173, 106, 54, 91, 32, 133, 8, 188, 216, 164, 9, 83, 204, 149, 224, 200, 66, 194, 142, 221, 3, 15, 86, 88, 178, 6, 194, 68, 87, 73, 29, 64, 173, 25, 93, 248, 211, 112, 252, 186, 186, 249, 79, 93, 20, 105, 50, 99, 40, 160, 38, 230, 235, 177, 64, 107, 16, 43, 187, 199, 214, 135])] } }). 
   
2022-06-10 18:53:34.681 DEBUG tokio-runtime-worker sync: 12D3KooWN6W3yZYrd5FN8ysPGD5EMryvG4TS6EoNkkzAxmWr37wf disconnected    

To Reproduce

Steps I took to reproduce the behaviour:

  1. Use Gossamer testing runtime substrate-node-template#1 for the Substrate node.
  2. Create a genesis file using the substrate-node-template.
  3. Run Substrate using target/release/gssmr-test-node --alice --chain genesis.json --log=debug &> debug.log
  4. Run two Gossamer nodes:

./bin/gossamer --key bob --bootnodes /ip4/127.0.0.1/tcp/30333/p2p/12D3KooWShRJhQnccFh7aRXPKnjAnUG4x3Lrwfw4wLgTFyDVfqh7 --port 7010 --basepath /tmp/bob --genesis ../substrate-node-template/genesis.json --log trace > bob.log

./bin/gossamer --key charlie --bootnodes /ip4/127.0.0.1/tcp/30333/p2p/12D3KooWShRJhQnccFh7aRXPKnjAnUG4x3Lrwfw4wLgTFyDVfqh7 --port 7011 --basepath /tmp/charlie --genesis ../substrate-node-template/genesis.json --log trace > charlie.log

  5. Wait for blocks to be produced, keeping an eye on the Substrate logs. The peer count should eventually drop from (2 peers) to (1 peers) to (0 peers).

Log output

Logs of all three nodes are in this gist: https://gist.github.com/kishansagathiya/ec47bce650b926ff2a0c463d792080a6

@danforbes
Contributor

Are we sure that this is an "atomic" problem or is it possible that it's just a symptom of some larger problem (e.g. trie encoding)?

@kishansagathiya
Contributor Author

Are we sure that this is an "atomic" problem or is it possible that it's just a symptom of some larger problem (e.g. trie encoding)?

Don't know yet. Will let you know after more inspection, when this issue is picked up.

@kishansagathiya
Contributor Author

kishansagathiya commented Jun 20, 2022

substrate's block request

2022-06-20 18:04:34.675 TRACE tokio-runtime-worker sync: New block request for 12D3KooWQs5HydobYuhjPWq4YTCQzf9ZcygU3chasCSiYWmGD7Sr, (best:2, common:1) BlockRequest { id: 0, fields: HEADER | BODY | JUSTIFICATION, from: Hash(0x4a3b893afa51c5e64f638106f222fb61a469b40dae078f35351fce24c42a65e5), to: None, direction: Descending, max: Some(1) }    

substrate's block response

2022-06-20 18:04:34.678 TRACE tokio-runtime-worker sync: BlockResponse 0 from 12D3KooWQs5HydobYuhjPWq4YTCQzf9ZcygU3chasCSiYWmGD7Sr with 1 blocks  (1)    
2022-06-20 18:04:34.678 TRACE tokio-runtime-worker sync: Reversing incoming block list    
2022-06-20 18:04:34.678 DEBUG tokio-runtime-worker sync: Received block that was not requested. Requested Hash(0x4a3b893afa51c5e64f638106f222fb61a469b40dae078f35351fce24c42a65e5), got Some(Header { parent_hash: 0xee87cc7ac5e87460094ec4ba29bbf8f28fb7741b69e7edb142233c3dd990b365, number: 1, state_root: 0x86127a0f08feda1cf7d6fb59889a22d0fdb523d142629589958857f29c00a97f, extrinsics_root: 0x0f8c09ad18646f8ef15087c0f1a7e5ce6bb24cafc018d4312e79e2ede62813d3, digest: Digest { logs: [DigestItem::PreRuntime([66, 65, 66, 69], [3, 0, 0, 0, 0, 85, 26, 172, 24, 0, 0, 0, 0, 70, 49, 8, 215, 252, 224, 84, 49, 172, 241, 35, 181, 158, 110, 43, 225, 7, 85, 235, 245, 217, 53, 241, 210, 33, 213, 33, 157, 224, 158, 226, 117, 58, 232, 43, 22, 126, 106, 35, 87, 91, 218, 239, 179, 44, 247, 10, 20, 8, 100, 218, 195, 11, 173, 126, 45, 102, 234, 106, 189, 230, 114, 19, 5, 50, 93, 64, 95, 197, 69, 181, 51, 243, 43, 94, 239, 64, 113, 166, 23, 143, 143, 185, 219, 111, 242, 253, 227, 16, 251, 30, 235, 84, 5, 2, 14]), DigestItem::Consensus([66, 65, 66, 69], [1, 12, 212, 53, 147, 199, 21, 253, 211, 28, 97, 20, 26, 189, 4, 169, 159, 214, 130, 44, 133, 88, 133, 76, 205, 227, 154, 86, 132, 231, 165, 109, 162, 125, 1, 0, 0, 0, 0, 0, 0, 0, 142, 175, 4, 21, 22, 135, 115, 99, 38, 201, 254, 161, 126, 37, 252, 82, 135, 97, 54, 147, 201, 18, 144, 156, 178, 38, 170, 71, 148, 242, 106, 72, 1, 0, 0, 0, 0, 0, 0, 0, 144, 181, 171, 32, 92, 105, 116, 201, 234, 132, 27, 230, 136, 134, 70, 51, 220, 156, 168, 163, 87, 132, 62, 234, 207, 35, 20, 100, 153, 101, 254, 34, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), DigestItem::Seal([66, 65, 66, 69], [48, 76, 209, 208, 121, 87, 218, 58, 111, 233, 185, 210, 131, 11, 6, 8, 174, 49, 58, 171, 157, 77, 69, 17, 215, 32, 139, 134, 121, 23, 176, 103, 198, 23, 228, 139, 253, 82, 223, 144, 5, 232, 159, 102, 214, 105, 103, 156, 72, 131, 169, 86, 115, 178, 148, 19, 117, 38, 12, 104, 35, 140, 134, 132])] } }).  
  

So Substrate is asking for block number 2: the value of best in the BlockRequest is 2, and I verified in the logs that 0x4a3b893afa51c5e64f638106f222fb61a469b40dae078f35351fce24c42a65e5 is the hash of block 2. However, we are responding with block number 1.

@kishansagathiya kishansagathiya self-assigned this Jul 6, 2022
kishansagathiya added a commit that referenced this issue Jul 11, 2022
A block request contains a start block, an end block, a direction, and a max size. It can happen that the number of blocks between start and end exceeds the max size. In such cases we were, until now, always pruning blocks from the end while preparing a block response.

The correct approach, however, is to prune from the start if the direction is descending, and from the end if the direction is ascending.

Since we were pruning blocks the wrong way, we were preparing responses that Substrate was not expecting, which resulted in Substrate dropping us.

This commit fixes that issue.

Fixes #2595