-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Add a optional json dump state to evm-bin #9706
Conversation
Also fixes json-test output on finish, and allow to put both on err or out (--out-only and --err-only).
It looks like this contributor signed our Contributor License Agreement. 👍 Many thanks, Parity Technologies CLA Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only did a preliminary review, I don't see why we can't merge this as a temporary solution for jsontests debug, but I would advice we do it with real caution (for example, hide it under a config flag in ethcore, or other non-invasive method!).
The issue is that State::to_pod
is actually broken! Currently as it is implemented, it will not print out all accounts in the state, but only accounts in the cache. I can easily imagine this leads to plenty of confusions whenever we had any state root mismatch, and we left the debuggers in the dark wondering why the hell that account doesn't appear in the state dump!
State::to_pod
is only used by State::to_pod_diff
, meaning we cannot change its functionality in the place. One possible way to make this better:
- Remove
pub
fromState::to_pod
-- it shouldn't be public IMHO! - Create another (slower) method in
State
that properly prints out all accounts in the state. This requires iterating over the trie, but we have all associated methods. Or we can also extendClient::list_accounts
and make the count limit optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. (Left a comment for an issue in to_pod_slow
!) A few grumbles.
Cargo.toml
Outdated
@@ -108,6 +108,8 @@ memory_profiling = [] | |||
# in order to manually test that parity fall-over to the local version | |||
# in case of invalid or deprecated command line arguments are entered | |||
test-updater = ["parity-updater/test-updater"] | |||
# Allow functionalities for evmbin or test only. | |||
evmbin-only = ["ethcore/evmbin-only"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this feature flag really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No just a reexport, can remove.
ethcore/Cargo.toml
Outdated
@@ -110,3 +112,5 @@ test-heavy = [] | |||
benches = [] | |||
# Compile test helpers | |||
test-helpers = ["tempdir"] | |||
# Slow functionalities for evmbin or test only. | |||
evmbin-only = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nitpick -- suggest to call it something more direct, or maybe just to-pod-slow
, or to-pod-full
.
ethcore/src/state/mod.rs
Outdated
|
||
/// Only for use with tests (to slow for real use), | ||
/// panic if feature 'evmbin-only' is not activated. | ||
#[cfg(not(feature="evmbin-only"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need an explicit panic here? We should be able to just remove this section all together if to_pod_slow
is indeed not used elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only way, is to move call in evmbin, makes thing a bit more complicated. In fact it removes the need for a 'do_dump' fn in informant trait, it is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't get you on this part. Isn't evmbin always built with this feature on? And if this function is not used anywhere else, I think we can remove it without getting any compiling errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function was call from EvmTestClient (in ethcore module). EvmTestClient is used in other context than evmbin only (json test only at the time, plus libfuzzer, and anything because public). And the function call depend on a trait function (could not be in the evmbin informant trait cause this one is dropped at the time of the function call). So I needed an alternate implementation to compile. Anyway, with last commit it is not needed anymore.
ethcore/src/state/mod.rs
Outdated
@@ -939,8 +939,7 @@ impl<B: Backend> State<B> { | |||
} | |||
} | |||
|
|||
/// Populate a PodAccount map from this state. | |||
pub fn to_pod(&self) -> PodState { | |||
fn cache_to_pod(&self) -> PodState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep the docs /// Populate a PodAccount map from this state.
.
ethcore/src/state/mod.rs
Outdated
#[cfg(feature="evmbin-only")] | ||
/// Populate a PodAccount map from this state. | ||
/// Warning this is not for real time use | ||
pub fn to_pod_slow(&self) -> PodState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick -- maybe name this something more about its functionality to_pod_full
or others.
@@ -952,8 +951,58 @@ impl<B: Backend> State<B> { | |||
})) | |||
} | |||
|
|||
#[cfg(feature="evmbin-only")] | |||
/// Populate a PodAccount map from this state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docs here that this requires FatDB.
ethcore/wasm/run/src/runner.rs
Outdated
@@ -66,7 +66,7 @@ impl Fail { | |||
} | |||
|
|||
impl fmt::Display for Fail { | |||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | |||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo -- looks like incorrect indentation.
evmbin/src/info.rs
Outdated
/// Display final result. | ||
fn finish(result: RunResult<Self::Output>); | ||
fn finish(result: RunResult<Self::Output>, &mut Self::Sink); | ||
/// Get trie spec to use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation is wrong here.
ethcore/src/state/mod.rs
Outdated
|
||
for item in iter { | ||
if let Ok((addr, dbval)) = item { | ||
let mut account = from_rlp(&dbval[..]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right IMO -- what this does is to fetch everything from the trie, so if we didn't call State::commit
before State::to_pod_slow
, then this will give us incorrect results.
Maybe we can just fetch all addresses here, combine it with addresses from cache of the state, and then use the same method as in to_pod_cache
to get the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed it needs a previous 'commit' call to be correct, let's make cache then trie version.
ethcore/src/state/mod.rs
Outdated
@@ -939,8 +939,7 @@ impl<B: Backend> State<B> { | |||
} | |||
} | |||
|
|||
/// Populate a PodAccount map from this state. | |||
pub fn to_pod(&self) -> PodState { | |||
fn cache_to_pod(&self) -> PodState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick -- either name two functions to_pod_cache
and to_pod_full
, or cache_to_pod
and full_to_pod
.
Use cache first in 'to_pod_full', for in between commits case. Change dump activation to use a function pointer instead.
ethcore/src/state/mod.rs
Outdated
result.entry(Address::from_slice(&addr)).or_insert_with(|| { | ||
let mut account = from_rlp(&dbval[..]); | ||
if account.code_size().is_none() { | ||
let addr_hash = account.address_hash(&H160::from(&addr[..])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed this -- this will also not completely work -- for a fresh account object, we would miss all storage items! And I do believe we have plenty of tests that deal with storage items. So I think we would also need to fix this.
We do already have logics in cli::blockchain::execute_export_state
that do this. You can refer to its logic, and I would recommend one of the two ways to deal with this.
Refactor Client::list_accounts
and Client::list_storage
Currently they only work for real client, but not EVM test client, but I believe it would be possible to extend this interface, and change the count
argument to accept Option
. With this, we change the signature to to_pod_full(&self, accounts: HashMap<Address, Vec<H256>>)
, where for the hashmap, keys are account addresses, and values are existing storage items.
Within this function, call State::require
for all addresses, and call State::storage_at
for all storage items. After that, we can use State::to_pod_cache
safely to get all addresses.
Simpler without Refactoring
Simply use the same logic as Client::list_accounts
and Client::list_storage
, but copy them to State
, gather HashMap<Address, Vec<H256>>
there. And use State::require
, State::storage_at
, and State::to_pod_cache
.
this case to reuse existing code. Reuse of `storage_at` was not easy in this case (could not iterate and use the method at the same time (refcell mutable borrow panics) so keeping code as is.
|
||
let mut pod_account = PodAccount::from_account(&account); | ||
// cached one first | ||
pod_storage.append(&mut pod_account.storage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use self.storage_at
instead. The issue is that what we have in trie and what we have in cache might differ, and we need to use the value in cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reuse of
storage_at
was not easy in this case (could not iterate and
use the method at the same time (refcell mutable borrow panics) so
keeping code as is.
Would you mind to try this:
- Collect all storage items from trie first in a
Vec<H256>
. - Call
storage_at
at the end of the round.
Not sure whether that would solve the lifetime issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this will works (but I kinda wanted to avoid that :) ), I will switch to this version for the sake of using similar code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use self.storage_at instead. The issue is that what we have in trie and what we have in cache might differ, and we need to use the value in cache.
Doesn't the merge ('append') of both storage do that? (pod_account got values from storage_change replacing value read from trie). Can value from 'original_storage_cache' of 'Account' differs from trie value (this field is lost in the process) ?
Switching to 'storage_at' anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cheme Ah sorry. Yeah you're right. PodAccount::from
would only print things on storage_changes
, so using State::storage_at
won't be sufficient for our case. I think the current method works.
This comment has been minimized.
This comment has been minimized.
@sorpaas , I did test your code, the problem is that 'storage_at' just update the lru cache, and when using 'to_pod_cache' -> 'PodAccount::from_account' will only display 'storage_changes', so we do not have any storage displayed.
and with 'storage_at', I got:
So I am wondering, do we want to display all storage value or only the changed one? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality looks good to me. A few grumbles. Also would appreciate some simple tests!
ethcore/src/state/mod.rs
Outdated
let addr_hash = account.address_hash(address); | ||
let accountdb = self.factories.accountdb.readonly(self.db.as_hashdb(), addr_hash); | ||
|
||
if account.code_size().is_none() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this case possible? Previously we called State::require
with require_code to be true. So code_size cannot be None?
Should we change this to an assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like if this ever happens, then it means previously we had a DB failure caching the code. So below we shouldn't call cache_code
again. Would suggest we either change this to just skip code and print an warning or return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, seems impossible, it is a remnant from my previous code where I did not use 'require'.
ethcore/src/state/mod.rs
Outdated
} | ||
|
||
// Resolve missing part | ||
for (add, opt) in self.cache.borrow_mut().iter_mut() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to borrow a mutable reference here? I don't think we're changing anything in the cache from this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing, it is from previous code that did not use qrequire', will be removed.
ethcore/src/state/mod.rs
Outdated
PodState::from(result) | ||
} | ||
|
||
fn account_to_pod_account(&self, account: &mut Account, address: &Address) -> PodAccount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be #[cfg(feature="to-pod-full")]
as well. Also would be great if we can include a short-line docs describing its behavior (merging all storage items into pod account).
ethcore/src/state/mod.rs
Outdated
} | ||
|
||
account.storage_root() | ||
.map(|root| self.factories.trie.readonly(accountdb.as_hashdb(), &root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't quite get why we need map
here. I think we can use and_then
or similar if we're not interested in the return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change to returning Result, it makes things looks a bit nicer.
@cheme Yeah you're right. I was wrong previously -- we cannot reuse |
And indeed we want to display all storage values just to avoid any confusion. |
ethcore/src/state/mod.rs
Outdated
let _ = account.cache_code(accountdb.as_hashdb()); | ||
} | ||
|
||
account.storage_root() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use base_storage_root
here.
Use 'base_storage_root' instead of 'storage_root'. Added a test, it will only execute with json-test in ci, or when launch with the feature.
ethcore/Cargo.toml
Outdated
@@ -110,3 +112,5 @@ test-heavy = [] | |||
benches = [] | |||
# Compile test helpers | |||
test-helpers = ["tempdir"] | |||
# Slow method to-pod-full for evmbin or test only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Enables slow to_pod_full
method for use in tests and evmbin"?
/// Creates new EVM test client with in-memory DB initialized with genesis of given Spec. | ||
pub fn new(spec: &'a spec::Spec) -> Result<Self, EvmTestError> { | ||
let factories = Self::factories(); | ||
/// You can specify a specific trie kind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Takes a TrieSpec
to set the type of trie: Generic, Fat, Secure"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am skipping the different type of trie (chance of leading to outdated comment in case of change on the variants).
}) | ||
} | ||
|
||
/// Creates new EVM test client with in-memory DB initialized with genesis of given Spec. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/with in-memory/with an in-memory/
("…of given kind": but the code says it's using the Secure
kind so I think the doc needs to change a bit)
/// Creates new EVM test client with in-memory DB initialized with given PodState. | ||
pub fn from_pod_state(spec: &'a spec::Spec, pod_state: pod_state::PodState) -> Result<Self, EvmTestError> { | ||
let factories = Self::factories(); | ||
/// You can specify a specific trie kind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Takes a TrieSpec
to set the type of trie: Generic, Fat, Secure"
}) | ||
} | ||
|
||
fn factories() -> Factories { | ||
/// Creates new EVM test client with in-memory DB initialized with given PodState. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/with in-memory/with an in-memory/
ethcore/src/state/mod.rs
Outdated
/// Create a PodAccount from an account. | ||
/// Differs from existing method by adding all storage | ||
/// values of the account to the PodAccount. | ||
/// This function shall therefore only be used for small |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "This function is only intended for use in small tests or with fresh accounts"?
ethcore/src/state/mod.rs
Outdated
let trie = self.factories.trie.readonly(accountdb.as_hashdb(), &root)?; | ||
for o_kv in trie.iter()? { | ||
if let Ok((key, val)) = o_kv { | ||
pod_storage.insert(H256::from(&key[..]), H256::from(U256::from(&val[..]))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can do pod_storage.insert(&key[..].into(), U256::from(&val[..]).into());
if you declare the pod_storage
with BTreeMap::new::<H256, H256>::new()
?
balance: U256::zero(), | ||
nonce: U256::zero(), | ||
code: Some(Default::default()), | ||
storage: vec![(H256::from(&U256::from(1u64)), H256::from(&U256::from(20u64)))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you have to pass over U256
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure at all, copy paste from other tests :) But it makes me understand a previous conversation on riot about casting usize to H256 : it appends 0 at the end of the array, so we need the intermediate U256 casting to get the expected value (I got a bug from this at first).
Personally I would make the cast from usize to H256 preppends 0, it is a really big source of error.
evmbin/src/info.rs
Outdated
TransactResult::Ok { state_root, gas_left, output, vm_trace, end_state, .. } => { | ||
if state_root != post_root { | ||
(Err(EvmTestError::PostCondition(format!( | ||
"State root mismatch (got: 0x{:x}, expected: 0x{:x})", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use {:#x}
here?
evmbin/src/main.rs
Outdated
@@ -89,6 +89,9 @@ State test options: | |||
General options: | |||
--json Display verbose results in JSON. | |||
--std-json Display results in standardized JSON format. | |||
--dump-json Display result state dump in standardized JSON format. | |||
--err-only With --std-json redirect to err output only. | |||
--out-only With --std-json redirect to out output only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is a dumb idea, but there's a risk users get it wrong and use --err-only
without --dump-json
so we could do
--dump-json … …
--dump-json-err-only … …
--dump-json-out-only … …
i.e. have the three options mutually exclusive instead of combining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact dump-json is a variant of std-json, so it would be --std-json-err-only. It is not really coherent, I will switch to
--std-json
--std-err-only
--std-out-only
--std-dump-json Display result in standardized JSON format with additional state dump.
Comments fixes. Minor code changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
ethcore/src/pod_state.rs
Outdated
@@ -26,7 +26,7 @@ use types::state_diff::StateDiff; | |||
use ethjson; | |||
|
|||
/// State of all accounts in the system expressed in Plain Old Data. | |||
#[derive(Debug, Clone, PartialEq, Eq, Default)] | |||
#[derive(Debug, Clone, PartialEq, Eq, Default, Serialize)] | |||
pub struct PodState (BTreeMap<Address, PodAccount>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this PR, but there's an extra space between PodState
and (BTreeMap
.
for (add, opt) in self.cache.borrow().iter() { | ||
if let Some(ref acc) = opt.account { | ||
let pod_account = self.account_to_pod_account(acc, add)?; | ||
result.insert(add.clone(), pod_account); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Address
is copy, so we can just do *add
here.
Could you give this a final review @dvdplm @niklasad1 |
@cheme please re-base |
Could you give this a final review @dvdplm @niklasad1 @tomusdrw |
This PR allows evm-bin to produce a json dump of the end state.
This is activated by using '--std-dump-json' which is a variant of '--std-json'.
This feature was asked for on ethereum/fuzztests gitter by @cdetrio and @holiman and geth related pr is : ethereum/go-ethereum#17832 .
Two other variants are added for '--std-json' to allow some flexibility on the output streams : '--std-err-only' and '--std-out-only' that either puts everything on err or on out, I think it is one of the point of #9408 .
The produce json for :
parity-evm state-test --std-json --std-dump-json --std-out-only ../../randomtests/Constantinople.minified.DEFAULT-Wed_13_52_03-13586-34149-test.json
is :Note for reviewer: