This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Block import error: Backend error: Can't canonicalize missing block number #2482889 when importing {BLOCK_HASH} (#2486985) #12613

Closed
2 tasks done
jasl opened this issue Nov 3, 2022 · 9 comments · Fixed by #12949

Comments

@jasl
Contributor

jasl commented Nov 3, 2022

Is there an existing issue?

  • I have searched the existing issues

Experiencing problems? Have you tried our Stack Exchange first?

  • This is not a support question.

Description of bug

I have also submitted this to Stack Exchange.

2022-11-01 07:08:54 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:08:54 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:08:55 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405357 (0x2ba4…5c69), ⬇ 1.8MiB/s ⬆ 2.7kiB/s
2022-11-01 07:08:56 [Relaychain] ⚙️  Syncing 48.7 bps, target=#15134037 (30 peers), best: #9247717 (0x6478…a2cc), finalized #9247232 (0xd3a2…417d), ⬇ 1.2MiB/s ⬆ 132.9kiB/s
2022-11-01 07:09:00 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405357 (0x2ba4…5c69), ⬇ 8.8MiB/s ⬆ 3.3kiB/s
2022-11-01 07:09:01 [Relaychain] ⚙️  Syncing 43.5 bps, target=#15134037 (30 peers), best: #9247935 (0x9a21…f6fe), finalized #9247744 (0xc1c1…b50c), ⬇ 1.0MiB/s ⬆ 139.8kiB/s
2022-11-01 07:09:05 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:05 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:05 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629792 (11 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.0MiB/s ⬆ 4.2kiB/s
2022-11-01 07:09:06 [Relaychain] ⚙️  Syncing 48.3 bps, target=#15134037 (30 peers), best: #9248177 (0xb440…3ff2), finalized #9247744 (0xc1c1…b50c), ⬇ 1.0MiB/s ⬆ 121.6kiB/s
2022-11-01 07:09:10 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629792 (13 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.7MiB/s ⬆ 4.6kiB/s
2022-11-01 07:09:11 [Relaychain] ⚙️  Syncing 42.5 bps, target=#15134037 (30 peers), best: #9248390 (0xdbc3…aad9), finalized #9248256 (0xb48f…6a96), ⬇ 953.1kiB/s ⬆ 120.9kiB/s
2022-11-01 07:09:15 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629793 (11 peers), best: #2482944 (0x7272…aac6), finalized #405593 (0x89a5…6ff9), ⬇ 9.0MiB/s ⬆ 4.2kiB/s
2022-11-01 07:09:15 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:15 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:16 [Relaychain] ⚙️  Syncing 47.7 bps, target=#15134045 (30 peers), best: #9248629 (0x0da6…9e37), finalized #9248256 (0xb48f…6a96), ⬇ 1.0MiB/s ⬆ 130.5kiB/s
2022-11-01 07:09:20 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629793 (13 peers), best: #2482944 (0x7272…aac6), finalized #405829 (0x92bb…2cd7), ⬇ 5.7MiB/s ⬆ 7.7kiB/s
2022-11-01 07:09:21 [Relaychain] ⚙️  Syncing 43.3 bps, target=#15134046 (30 peers), best: #9248846 (0x4dff…ab5b), finalized #9248769 (0xeb85…0f4a), ⬇ 956.7kiB/s ⬆ 123.9kiB/s
2022-11-01 07:09:25 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629795 (13 peers), best: #2482944 (0x7272…aac6), finalized #405829 (0x92bb…2cd7), ⬇ 6.9MiB/s ⬆ 5.7kiB/s
2022-11-01 07:09:26 [Relaychain] ⚙️  Syncing 43.1 bps, target=#15134046 (30 peers), best: #9249062 (0xf167…12a9), finalized #9248769 (0xeb85…0f4a), ⬇ 889.9kiB/s ⬆ 140.6kiB/s
2022-11-01 07:09:30 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629798 (14 peers), best: #2482944 (0x7272…aac6), finalized #406070 (0x49b1…c151), ⬇ 9.4MiB/s ⬆ 5.5kiB/s
2022-11-01 07:09:31 [Relaychain] ⚙️  Syncing 45.6 bps, target=#15134046 (30 peers), best: #9249290 (0x88d5…5cf9), finalized #9249281 (0x2c85…36a1), ⬇ 1.1MiB/s ⬆ 146.0kiB/s
2022-11-01 07:09:35 [Parachain] ⚙️  Syncing  0.0 bps, target=#2629798 (15 peers), best: #2482944 (0x7272…aac6), finalized #406070 (0x49b1…c151), ⬇ 5.8MiB/s ⬆ 3.6kiB/s
2022-11-01 07:09:36 [Relaychain] ⚙️  Syncing 45.9 bps, target=#15134046 (30 peers), best: #9249520 (0x9522…56f7), finalized #9249281 (0x2c85…36a1), ⬇ 1000.6kiB/s ⬆ 112.3kiB/s
2022-11-01 07:09:38 [Parachain] Block import error: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:38 [Parachain] 💔 Error importing block 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead: consensus error: Import failed: Backend error: Can't canonicalize missing block number #2482945 when importing 0xa0000aacfc918561284868cba29427613320594d3b129dc5981b0946f8368ead (#2487041)
2022-11-01 07:09:38 [Parachain] 💔 Error importing block 0x27939286e6722c962d6c964e4bfc4a3371b2ac6d78f6b2ee534a6a1ad7543786: block has an unknown parent

I tried waiting a few hours and restarting the node a few times, but the node can't recover from the error.
I also tried deleting the DB and doing a re-sync; the error still occurs, on another block.
I've heard that other people have this problem too, so I think it's worth filing a bug report.

We're running the node in --pruning archive mode; in my case I'm using ParityDB.

Here's a log with -l db=trace,sync=trace
khala-node.filtered.log

Steps to reproduce

Start a new node and wait; leaving it running overnight may be enough to trigger the error.

docker run -dti --name khala-node -e NODE_ROLE=MINER -v ~/khala-node:/root/data phalanetwork/khala-node
The command is equivalent to khala-node --pruning archive -- --pruning archive

@pangwa
Contributor

pangwa commented Nov 12, 2022

I experienced the same error, and a fresh sync doesn't help.

@bkchr
Member

bkchr commented Nov 12, 2022

@pangwa so you can reproduce this every time?

@pangwa
Contributor

pangwa commented Nov 12, 2022

@pangwa so you can reproduce this every time?
It has happened twice.
My node is still syncing. The parachain node is stuck at the block (e.g. 2182153) where the block import error happened. Block finalization is still in progress and catching up to block 2182153 (1443366/2182153). I'll try restarting the node once finalization catches up to 2182153.

After a restart, I see the errors below.

Candidate included without being backed? candidate_hash=0x31858bec0ba9c286befe3fcaa1c8aa231f00707788d8411748a70c5e3b3fd5af traceID=65825585232036832991727261035394542115
Candidate included without being backed? candidate_hash=0xa9866a64d01277aa718e1bbda3899598e7eead9479aacbb98abbd5032e4ec730 traceID=225337456989323895739798477171113694616
Candidate included without being backed? candidate_hash=0xac925a2b8100c0881acf2430d2b93348c7aaf9d2e53f2e5efb2433ca7b2d3738 traceID=229387119479951407208932160026445427528

@bkchr
Member

bkchr commented Nov 12, 2022

@pangwa could you run with -ldb=trace? We would need the logs around the failure.

@pangwa
Contributor

pangwa commented Nov 12, 2022

@pangwa could you run with -ldb=trace? We would need the logs around the failure.

Hey, it looks like the node got synced after I performed a resync (deleting all the chain data and resyncing the node). The logs mentioned above seem to be harmless.

@jasl
Contributor Author

jasl commented Nov 12, 2022

@bkchr I have the log; did you find anything interesting?

@bkchr
Member

bkchr commented Nov 12, 2022

@bkchr I have the log; did you find anything interesting?

I had totally overlooked this. I've now looked into the logs and haven't seen anything useful so far.

@andabak

andabak commented Nov 24, 2022

Hi. Any updates/estimates on this issue? We believe this issue is preventing our nodes from syncing (Astar). Thanks!

bkchr added a commit that referenced this issue Dec 15, 2022
There is this issue about missing block numbers on forced canonicalization. I have now looked over
the code 10000 times, and there are possible ways this can be triggered, but I don't really know how
it actually is triggered. So, this PR is going to solve the symptom and not the cause. The block
number to hash mapping is set when we import a new best block. Forced canonicalization will now stop
at the best block, and it will canonicalize the other blocks later, once the best block has moved.
As the error reports indicated that this issue mainly happened during major sync, there should not
be any forks, so not doing the canonicalization directly shouldn't be that harmful. All known
implementations should import all blocks as best block during major sync anyway (I mean, the bug is
somewhere, but I haven't found it yet).

I will also make some changes to Cumulus around a potential culprit for this issue.

Closes: #12613
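
The behaviour described in the commit message can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual Substrate code: `ToyBackend`, its methods, and `demo` are hypothetical names, and the delay of 4096 is assumed because it matches the gap between the two block numbers in the reported log lines (2487041 - 2482945). The model writes the number to hash mapping only for blocks imported as best, then compares an unconditional lookup with one that stops at the best block.

```rust
use std::collections::HashMap;

/// Assumed distance between the block being imported and the block that is
/// force-canonicalized. 4096 matches the gap in the reported logs.
const CANONICALIZATION_DELAY: u64 = 4096;

/// Toy stand-in for the database backend (hypothetical, not the Substrate types).
#[derive(Default)]
pub struct ToyBackend {
    /// Number -> hash mapping; only written when a block is imported as best.
    number_to_hash: HashMap<u64, String>,
    best_number: u64,
}

impl ToyBackend {
    /// Record an imported block. Non-best blocks get no number -> hash entry.
    pub fn import(&mut self, number: u64, hash: &str, is_best: bool) {
        if is_best {
            self.number_to_hash.insert(number, hash.to_string());
            self.best_number = number;
        }
    }

    /// Look up the block to canonicalize, failing if it has no mapping.
    fn lookup(&self, target: u64, importing: u64) -> Result<u64, String> {
        if self.number_to_hash.contains_key(&target) {
            Ok(target)
        } else {
            Err(format!(
                "Can't canonicalize missing block number #{} when importing #{}",
                target, importing
            ))
        }
    }

    /// Models the reported failure: the target is looked up unconditionally.
    pub fn force_canonicalize_strict(&self, importing: u64) -> Result<u64, String> {
        let target = importing.saturating_sub(CANONICALIZATION_DELAY);
        self.lookup(target, importing)
    }

    /// Models the described fix: never canonicalize past the best block.
    /// Blocks above it are left for a later call, once the best block moved.
    pub fn force_canonicalize_capped(&self, importing: u64) -> Result<u64, String> {
        let target = importing
            .saturating_sub(CANONICALIZATION_DELAY)
            .min(self.best_number);
        self.lookup(target, importing)
    }
}

/// Replays the numbers from the logs: best block #2482944, while block
/// #2487041 is imported without becoming the best block.
pub fn demo() -> (Result<u64, String>, Result<u64, String>) {
    let mut backend = ToyBackend::default();
    backend.import(2_482_944, "best-block-hash", true);
    backend.import(2_487_041, "imported-block-hash", false);
    (
        backend.force_canonicalize_strict(2_487_041),
        backend.force_canonicalize_capped(2_487_041),
    )
}
```

In this model the strict variant fails on #2482945, which has no mapping because the best block is still #2482944, while the capped variant stops at #2482944 and defers the rest. As the commit message says, this addresses the symptom; the sketch does not explain why the best block lags behind in the first place.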
paritytech-processbot bot pushed a commit that referenced this issue Dec 16, 2022
* Fix missing block number issue on forced canonicalization

* Add some docs

* Fix fix

* Review comments

* Review comments
@Polkadot-Forum

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/polkadot-release-analysis-v0-9-37/1736/1

ltfschoen pushed a commit to ltfschoen/substrate that referenced this issue Feb 22, 2023
…#12949)

* Fix missing block number issue on forced canonicalization

* Add some docs

* Fix fix

* Review comments

* Review comments
ark0f pushed a commit to gear-tech/substrate that referenced this issue Feb 27, 2023
…#12949)

* Fix missing block number issue on forced canonicalization

* Add some docs

* Fix fix

* Review comments

* Review comments