Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

try-runtime-cli panics on Polkadot #14006

Closed
ggwpez opened this issue Apr 25, 2023 · 11 comments · Fixed by #14057
Closed

try-runtime-cli panics on Polkadot #14006

ggwpez opened this issue Apr 25, 2023 · 11 comments · Fixed by #14057
Assignees
Labels
U2-some_time_soon Issue is worth doing soon.

Comments

@ggwpez
Copy link
Member

ggwpez commented Apr 25, 2023

On commit Polkadot 3595684b0293439c32dfe563008958c7d0091289 ran:

cargo b -r --bin polkadot  --features try-runtime

RUST_LOG=runtime=debug,remote-ext=trace target/release/polkadot  try-runtime --detailed-log-output --chain polkadot-dev --runtime target/release/wbuild/polkadot-runtime/polkadot_runtime.wasm on-runtime-upgrade --checks=pre-and-post live --uri wss://polkadot-try-runtime-node.parity-chains.parity.io:443

Truncated output:

DEBUG ThreadId(47) remote-ext: Remaining payloads: 36003 Batch request size: 2435    
 INFO main remote-ext: inserting keys progress = 3% [50000 / 1843038]    
DEBUG ThreadId(35) remote-ext: Remaining payloads: 8902 Batch request size: 11787    
DEBUG ThreadId(51) remote-ext: Batch request failed, trying again with smaller batch size. Networking or low-level protocol error: The request body was too large    
DEBUG ThreadId(51) remote-ext: Remaining payloads: 24471 Batch request size: 2679    
DEBUG ThreadId(45) remote-ext: Remaining payloads: 8902 Batch request size: 11787    
DEBUG ThreadId(36) remote-ext: Remaining payloads: 8902 Batch request size: 11787    
 INFO main remote-ext: inserting keys progress = 5% [100000 / 1843038]    
DEBUG ThreadId(32) remote-ext: Remaining payloads: 8902 Batch request size: 11787    
 INFO main remote-ext: inserting keys progress = 8% [150000 / 1843038]    
 INFO main remote-ext: inserting keys progress = 11% [200000 / 1843038]    
 INFO main remote-ext: inserting keys progress = 14% [250000 / 1843038]    
DEBUG ThreadId(48) remote-ext: Remaining payloads: 58582 Batch request size: 1215    
 INFO main remote-ext: inserting keys progress = 16% [300000 / 1843038]    

====================

Version: 0.9.41-3595684b029

   0: sp_panic_handler::set::{{closure}}
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:2001:9
      std::panicking::rust_panic_with_hook
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:696:13
   2: std::panicking::begin_panic::{{closure}}
   3: std::sys_common::backtrace::__rust_end_short_backtrace
   4: std::panicking::begin_panic
   5: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
   6: <futures_util::future::future::flatten::Flatten<Fut,<Fut as core::future::future::Future>::Output> as core::future::future::Future>::poll
   7: <futures_util::future::either::Either<A,B> as core::future::future::Future>::poll
   8: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
   9: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
  10: hyper::client::client::Client<C,B>::retryably_send_request::{{closure}}
  11: <hyper::client::client::ResponseFuture as core::future::future::Future>::poll
  12: <tokio::time::timeout::Timeout<T> as core::future::future::Future>::poll
  13: <jsonrpsee_http_client::client::HttpClient as jsonrpsee_core::client::ClientT>::batch_request::{{closure}}
  14: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  15: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  16: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  17: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  18: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  19: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  20: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  21: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  22: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  23: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  24: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  25: tokio::runtime::park::CachedParkThread::block_on
  26: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
  27: tokio::runtime::runtime::Runtime::block_on
  28: std::sys_common::backtrace::__rust_begin_short_backtrace
  29: core::ops::function::FnOnce::call_once{{vtable.shim}}
  30: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
      std::sys::unix::thread::Thread::new::thread_start
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys/unix/thread.rs:108:17
  31: <unknown>
  32: <unknown>


Thread '<unnamed>' panicked at 'dispatch dropped without returning error', /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.20/src/client/conn.rs:329

This is a bug. Please report it at:

        https://github.com/paritytech/polkadot/issues/new

Need this for testing a change. Full log from the CI.

@ggwpez ggwpez added the U2-some_time_soon Issue is worth doing soon. label Apr 25, 2023
@ggwpez
Copy link
Member Author

ggwpez commented Apr 25, 2023

It is doing 94 recursions of get_storage_data_dynamic_batch_size on Westend. Should maybe re-write this to use a loop to prevent an overflow.

@kianenigma
Copy link
Contributor

Seems like some of the parallelization added in #12537 is causing more issues than helping :(

@ggwpez
Copy link
Member Author

ggwpez commented Apr 25, 2023

Seems like some of the parallelization added in #12537 is causing more issues than helping :(

Yea not sure how much it helped with the speed, came up in the review here #13923 (comment)

But the panic is not from your parallel change, but rather also #13923. We should add a test that it does not break in Polkadot 😅

@liamaharon
Copy link
Contributor

liamaharon commented Apr 25, 2023

Hey thanks for raising this.

I had been testing try-runtime against a local Polkadot node on my laptop without any issues, I wasn't aware a CI env existed that needs testing against. I'm looking into this now.

@niklasad1
Copy link
Member

niklasad1 commented Apr 25, 2023

hyperium/hyper#2112

Try creating a new client in the "batch thread" instead of cloning should fix that...

Here

let thread_client = client.clone();
because it starts another tokio runtime in the new thread which might fuck up the hyper background thread/task because it is started on another runtime ^^

ggwpez added a commit to paritytech/polkadot that referenced this issue Apr 25, 2023
... until paritytech/substrate#14006 is done.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
paritytech-processbot bot pushed a commit to paritytech/polkadot that referenced this issue Apr 26, 2023
* Companion for substrate

* Adds kusama migration

* update lockfile for {"substrate"}

* Temp allow try-runtime to fail

... until paritytech/substrate#14006 is done.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: parity-processbot <>
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@ggwpez
Copy link
Member Author

ggwpez commented Apr 28, 2023

Yea this still fails, so i cant test #13417

@niklasad1
Copy link
Member

The same error after #14030? Then we need what I wrote in the PR

@ggwpez
Copy link
Member Author

ggwpez commented Apr 28, 2023

Yea looks like the same error

BT

2023-04-28 08:51:43.840 DEBUG ThreadId(48) remote-ext: Remaining payloads: 57361 Batch request size: 1336                                                                                                            
2023-04-28 08:51:44.178  INFO main remote-ext: inserting keys progress = 24% [450000 / 1842910]                                                                                                                      
                                                                                                                                                                                                                     
====================                                                                                                                                                                                                 
                                                                                                                                                                                                                     
Version: 0.9.41-3ec08c0ac34                                                                                                                                                                                          
                                                                                                                                                                                                                     
   0: sp_panic_handler::set::{{closure}}                                                                                                                                                                             
   1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call                                                                                                                                                
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:2001:9                                                                                                                    
      std::panicking::rust_panic_with_hook                                                                                                                                                                           
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:696:13                                                                                                                  
   2: std::panicking::begin_panic::{{closure}}                                                                                                                                                                       
   3: std::sys_common::backtrace::__rust_end_short_backtrace                                                                                                                                                         
   4: std::panicking::begin_panic                                                                                                                                                                                    
   5: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll                                                                                                                          
   6: <futures_util::future::future::flatten::Flatten<Fut,<Fut as core::future::future::Future>::Output> as core::future::future::Future>::poll                                                                                                                                   
   7: <futures_util::future::either::Either<A,B> as core::future::future::Future>::poll          
   8: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll                                              
   9: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll        
  10: hyper::client::client::Client<C,B>::retryably_send_request::{{closure}}                                                            
  11: <hyper::client::client::ResponseFuture as core::future::future::Future>::poll
  12: <tokio::time::timeout::Timeout<T> as core::future::future::Future>::poll                                                           
  13: <jsonrpsee_http_client::client::HttpClient as jsonrpsee_core::client::ClientT>::batch_request::{{closure}}                                                                                                                                                                  
  14: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}                                                                                                                       
  15: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  16: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  17: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  18: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  19: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  20: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  21: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}                                                                                                                       
  22: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  23: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  24: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  25: frame_remote_externalities::Builder<B>::get_storage_data_dynamic_batch_size::{{closure}}
  26: tokio::runtime::park::CachedParkThread::block_on                                        
  27: tokio::runtime::scheduler::multi_thread::MultiThread::block_on                          
  28: tokio::runtime::runtime::Runtime::block_on                                                          
  29: std::sys_common::backtrace::__rust_begin_short_backtrace                                
  30: core::ops::function::FnOnce::call_once{{vtable.shim}}                                   
  31: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once                
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once                
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
      std::sys::unix::thread::Thread::new::thread_start             
             at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys/unix/thread.rs:108:17                                
  32: <unknown>                                                                                           
  33: <unknown>                                                                                           
                                                                                                          
                                                                                                          
Thread '<unnamed>' panicked at 'dispatch dropped without returning error', /home/ubuntu/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.20/src/client/conn.rs:329

Now after a re-run it does work. so probably something flaky…

@liamaharon
Copy link
Contributor

Thanks for the logs. I'm going to prioritise refactoring this to remove the redundant multi-threading, which should resolve this issue.

@ggwpez
Copy link
Member Author

ggwpez commented Apr 28, 2023

It also happens with TRY_RUNTIME_MAX_THREADS=1 🤷‍♂️

@niklasad1
Copy link
Member

No, the problem is that the runtime that created client is closed when the background task tries to perform the RPC call ^^

The number of threads shouldn't matter

ggwpez added a commit to ggwpez/runtimes that referenced this issue Jul 13, 2023
* Companion for substrate

* Adds kusama migration

* update lockfile for {"substrate"}

* Temp allow try-runtime to fail

... until paritytech/substrate#14006 is done.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: parity-processbot <>
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
U2-some_time_soon Issue is worth doing soon.
Projects
Status: Done
4 participants