Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement EIP2718 and EIP2930 #1048

Merged
merged 32 commits into from
Mar 4, 2021
Merged

Implement EIP2718 and EIP2930 #1048

merged 32 commits into from
Mar 4, 2021

Conversation

jochem-brouwer
Copy link
Member

@jochem-brouwer jochem-brouwer commented Jan 17, 2021

This PR intends to implement EIP2718 and EIP2930 which are scheduled for YoloV3/Berlin.

I have decided to just dump the progress here and it is extremely WIP and at this point also extremely messy.

I started around trying to come up with a good design but kept running into problems. I've decided to create a first proof of concept and from that point identify which things we should change to come up with a consistent design. I've also created this PR so I can dump my thoughts on (mainly) the Tx implementation.

It is likely extremely hard or even impossible to make this PR backwards compatible, but it is worth giving it a try.

Here are some things I noticed:

  • It would make a lot of sense to reject certain behavior on a Transaction which is not signed. For instance, we can call hash() on a non-signed Transaction, but this makes no sense and actually returns a garbage hash (we could change this to return a raw hash which we would use to sign - that would make some sense). I've also considered to create two classes; a SignedTransaction and an UnsignedTransaction, but I'm not really sure here as this means that for every transaction type which we introduce we always have to introduce two classes (the signed one could extend the unsigned one though..?)
  • We should come up with some structure which places the data which we have to use at a single place. For instance, the raw() and the hash() method use the same copied code it seems. I tried creating an abstract BaseTransaction class which enforces some common logic, but the class itself would then have to implement the raw buffers to sign/hash/rlp-encode/etc.

Please do not review this yet, it is very messy at this point. I will use this PR to keep track of any stuff I notice along the way. This PR will likely end up to be a refactor of Tx which is hopefully backwards-compatible. If that's not the case, we can keep this PR and keep rebasing it until we decide we want to release a new Tx version.

Handy: Go-Ethereum implementation (also WIP)

Extra decisions made on encoding transactions

The intention is that this PR pushes breaking changes to the Tx package. It also refactors the Tx package. It intends to explicitly distinguish between unsigned and signed transactions.

TODOs:

  • Make tx types a bit more consistent.
  • Write docs, and especially update what hash(), serialize(), raw() of Tx are supposed to produce
  • Integrate EIP2930 in VM
  • Add transaction hash tests, for both unsigned and signed transaction

@jochem-brouwer
Copy link
Member Author

In general if I'd design this from scratch I would create a TransactionFactory which either returns a LegacyTransaction (see EIP2718 for the definition - LegacyTransaction is essentially the current Transaction class) or a TypedTransaction. We'd have to come up with some common interface for both of these since TypedTransaction can be arbitrary. They would at least require hash, sign and verify methods. The VM would then have to change behavior based upon what Transaction type we are using. This EIP2718 seems rather easy, but it seems very complex to implement as we have to change types everywhere and change behavior based upon the Transaction type.

@codecov
Copy link

codecov bot commented Jan 23, 2021

Codecov Report

Merging #1048 (8e02910) into master (93c1110) will increase coverage by 1.02%.
The diff coverage is 66.66%.

Impacted file tree graph

Flag Coverage Δ
block ?
blockchain ?
client 87.04% <ø> (-0.04%) ⬇️
common ?
devp2p ?
ethash ?
tx ?
vm 82.00% <66.66%> (-1.22%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@jochem-brouwer
Copy link
Member Author

Just temporarily fixed a circular dependency problem in bf63c74. It does move two classes within one file, which I had seperated out earlier. I think it is solvable in two files. The problem was that the Signed* tx variants imported their Unsigned* variant (because Signed* extends Unsigned*), but Unsigned* also imports their Signed* variants, because their static constructors return the right type based on the input.

@jochem-brouwer
Copy link
Member Author

There's something which I find rather weird, which I stumbled upon this refactor. In raw(), which we use to serialize() a transaction, we use empty buffers for the v,r,s values in case the tx is not signed. Why? It seems more logical to me to not include these buffers if the tx is not signed. Or is this part of some sort of consensus thing? Worth researching.

@jochem-brouwer
Copy link
Member Author

Another thing: we can check if the transaction has enough "upfront cost" (checks if gas limit is at least the cost to actually execute the transaction: the data fee plus the base fee). We can check this in the constructor (but we don't do this yet). Would it not make sense to verify this immediately in the constructor and throw if the provided limit is incorrect?

"d": "Gas cost per storage key in an Access List transaction"
}
},
"requiredEIPs": [2718, 2929],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nit: such a field such be defined higher in the parameter list - after minimumHardfork e.g., and not between this gas and config parameter dictionaries.

Apart from that: such a parameter has the problem that this doesn't include (at least in the implementation presented here) EIPs included in (encapsulated by) hardforks. So this only works if the dependent EIPs are then at some point included in the same HF.

I would prefer if we give this some more thought before we introducing such an option here (or maybe we directly find the solution along this discussion), so that this would also work if e.g. the dependent EIPs are added to a separate HF (so here: 2718, 2929).

This whole question here in the library of the relation of HFs and EIPs is generally a complex one and needs some careful thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry, I know this is a bit a side track and you are interested in more general feedback 😄 , just since we are already on it)

Update: so this is highly connected to the two HF file formats with the new format now being used for the first time for berlin and allowing to destructure the hardfork files and place the parameters in the respective EIP files (instead of clustering them in the HF file).

Maybe we'll target to continue this for the other HFs relatively soon, since this is generally the cleaner solution and allows for more flexibility in using this library? 🤔 For this use case here we can then also iterate through the HF definitions and see if the EIPs being required are present either in the eips parameter or in one of the active HFs?

With this perspective ahead I would say we can then also eventually leave the implementation here since - in this case - all EIPs affected will be integrated in the same HF. And latest with including the final version of berlin we can do this HF file decomposition.

Thoughts?

const common = transactionOptions.common ?? DEFAULT_COMMON
if (rawData[0] <= 0x7f) {
// It is an EIP-2718 Typed Transaction
if (!common.eips().includes(2718)) {
Copy link
Member

@holgerd77 holgerd77 Jan 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit related: we should also move away from these checks (I think they are also in other places of our codebase, think the VM?) and rather use a newly to be introduced function in Common, I would suggest: common.isActiveEIP().

This function should check if an EIP is activated either by eips param or by HF (since semantically this makes absolutely no difference).

@jochem-brouwer maybe as a first step you can add such a function, but for now do the same comparison internally (with eips().includes(...)) as above? Then we have this at least already encapsulated and can later on add the HF-based EIP checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this makes a lot of sense!

@holgerd77
Copy link
Member

Hi @jochem-brouwer, just had a first look, thanks for this excellent base work on such a wide ranging topic! 😀 🚀 With these preparations we should really be able to finish this in a structurally clean way providing a solid ground for working with transaction types in the years ahead, cool. 👍

The main concern I have based on the current state of the implementation: while transaction types theoretically and by EIP are conceptualized as being opaque (so they can contain any behavior, properties,...) in practice the new EIP-2930 transaction still shares 80%+ of its behavior with the legacy transactions. I would therefore have a strong tendency to give both these classes a shared base class, otherwise we copy far too much code. This could then also serve as some interface (abstract base class) where shared function signatures are defined and we can use this (hopefully) for typing on a lot of consumer functionality, otherwise we would have to do a lot of switches in the rest of the code base like (if txType === x use tx function a); if txType === y use tx function b)). @jochem-brouwer what are your thoughts here, were your questions on abstract classes in the chat also targeted on solving stuff around this? What do the others think?

I think since these updates will likely strongly affect our further design choices, we should give these questions some special attention, we can also discuss along our call on Wednesday. And I will also post a related question in the R&D Discord (Eth 1.x), coming a bit more from the research angle to complement.

Some more practical note, somewhat related to the topic from above: for this being able to be properly reviewed, we need to come to a commit-change-path with substantially smaller diffs, otherwise this will not really be reviewable with work being done on such a sensitive library like tx. If we decide to share more code this will already ease the situation. But this can also mean to split the PR into two PRs for each EIP. Let's see.

Putting this as a side note here first to not discuss everything at once: I am not yet convinced by this Unsigned/Singed transaction class split. Maybe for another discussion round. 😄

But again: fantastic start, just becoming aware how big of a change (for Ethereum) this actually is. 😎 🤩 🥳

@jochem-brouwer
Copy link
Member Author

Hi @holgerd77 , I actually intend to also come up with an abstract base class for transactions. You are right that these transactions currently share a lot of code so it makes a lot of sense to integrate what these classes have in common in a base class.

I will try to motivate this unsigned/signed stuff during our call.

@holgerd77
Copy link
Member

For reference: ethereum/tests#772 (Access list example test case)

@jochem-brouwer
Copy link
Member Author

Very weird, when I git mv unsignedLegacyTransaction.ts LegacyTransaction.ts git deletes the file and just dumps all the changes in a new file, so doesn't seem to be tracking any progress. It worked OK for EIP2930Transaction 🤔

@jochem-brouwer
Copy link
Member Author

Some notes to self:

Have a lot of byte types all around. Should investigate which are needed and descriptively name those. In particular, it seems that we need the hash of a transaction (this is included in a block header), the serialized transaction (this is included in a block body), the hash of the transaction (should be the hash of the serialized transaction), and the raw hash of the transaction (this is the hash of the transaction, but without the signature fields, possibly also RLP-encoded).

@jochem-brouwer
Copy link
Member Author

Rebased upon master.

Fixed the bug in #1042

@holgerd77
Copy link
Member

Very weird, when I git mv unsignedLegacyTransaction.ts LegacyTransaction.ts git deletes the file and just dumps all the changes in a new file, so doesn't seem to be tracking any progress. It worked OK for EIP2930Transaction 🤔

I think you need to do this in two commits, one with moving location and the other one with just renaming, otherwise git can't follow allow (think I can remember that I had a similar case).

@jochem-brouwer
Copy link
Member Author

Very weird, when I git mv unsignedLegacyTransaction.ts LegacyTransaction.ts git deletes the file and just dumps all the changes in a new file, so doesn't seem to be tracking any progress. It worked OK for EIP2930Transaction 🤔

I think you need to do this in two commits, one with moving location and the other one with just renaming, otherwise git can't follow allow (think I can remember that I had a similar case).

I will try this, but it still does not make much sense; if this would work and then I would squash these two commits together, it should lead to the same result. I also committed changes and git mv'd EIP2930Transaction. I will retry to see if it works. Maybe it has to be a fresh commit.

@jochem-brouwer
Copy link
Member Author

jochem-brouwer commented Jan 29, 2021

EDIT: this is resolved

I am having some problems coming up with a good type for this access list item (see 26b7d34). The actual type is of (Buffer | Buffer[])[][] (and it would be great if we could somehow ensure that the 0-index is Buffer and the 1-index is Buffer[] Just found a fix for this). However this clogs up the package a lot, since now in fromValuesArray we have to explicitly cast these types.

I want to actually enable a second type on TxData, where it is possible to create an array, where each item is of type {address: string, storageKeys: string[]} (this is inspired by the Geth implementation). At runtime, in the constructor, if this is the type, then it will be converted immediately to the Buffer array (since we need this for stuff like hashes, serializing, etc.). It will also be possible to convert a raw tx (so with a Buffer) to this more readable object with some getters (like accessListToJSON).

I am mainly a bit worried that this explicit (Buffer | Buffer[])[][] type is nice in terms of (better) type safety, but it makes the code somewhat less readable. Thoughts? @ryanio @cgewecke @holgerd77

tx: add fromSerializedTx method
tx: remove TODOs

tx: add fromSerializedTx to legacy transaction
@jochem-brouwer
Copy link
Member Author

Have opened a temporary separate branch on #1133 to see if CI passes.

I rebased this one into 31 commits, and mainly removed the entire process here where I first split up the files in signed/unsigned variants (this history is now gone). If I force push here, then it should be ready-to-review on a per-commit basis.

@holgerd77
Copy link
Member

@jochem-brouwer great stuff, thanks for going this extra mile! 😄

@jochem-brouwer
Copy link
Member Author

Ignore this commit, I was on the wrong branch 🤦

tx: move api test to legacy test

tx: lint
@jochem-brouwer jochem-brouwer added the type: test all hardforks This special label enables VM state and blockchain tests for all hardforks on the respective PR. label Mar 2, 2021
@jochem-brouwer jochem-brouwer marked this pull request as ready for review March 2, 2021 20:37
@jochem-brouwer
Copy link
Member Author

Very scary force push 😅

All right, rebased this one, ready for review! 🎉

Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did another round of review comments here.

} as PostByzantiumTxReceipt
const statusInfo = txRes.execResult.exceptionError ? 'error' : 'ok'
receiptLog += ` status=${txReceipt.status} (${statusInfo}) (>= Byzantium)`
} as EIP2930Receipt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this bring additional value here to add a new Receipt interface if the fields are all the same like on the PostByzantiumReceipt? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to be explicit here, but we can theoretically also use the PostByzantiumReceipt type.

receiptLog += ` status=${txReceipt.status} (${statusInfo}) (>= Byzantium)`
} as EIP2930Receipt

encodedReceipt = Buffer.concat([Buffer.from('01', 'hex'), encode(Object.values(txReceipt))])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the dynamic transaction type here instead of '01' to avoid future bugs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this entire dynamic type thing sounds like a great addition!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we are talking about the same things here, I was just suggesting to replace Buffer.from('01', 'hex') with Buffer.from([tx.transactionType]) (or similar) here. 😄

@@ -91,7 +91,7 @@ export class Block {
const transactions = []
for (const txData of txsData || []) {
transactions.push(
Transaction.fromValuesArray(txData, {
TransactionFactory.fromBlockBodyData(txData, {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was thinking a bit about this breaking part in Block.

So if I get this right the only breaking thing here would be Transaction now being a type in the main constructor, right?

Wouldn't we already back on the compatibility track if we rename this new type Transaction -> TypedTransaction (not sure, or another name, I would think though that this name is justified even if it encompassed the old transaction format, to be discussed) and then LegacyTransaction back to Transaction?

I actually like this in general (we might still want to rename back Transaction -> LegacyTransaction on a general breaking-release-round on all libraries, not sure), since having the type named TypedTransaction more clearly indicates that a function now allows for a typed tx as an input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this, but this still implies we have to use the TransactionFactory in Block, and we also have to use the TypedTransactions as the type for the transactions field in the block body. We can indeed export the LegacyTransaction as a Transaction, this would mean a non-breaking release for Transaction, but due to the type changes on Block they are breaking. (I am not entirely sure if they are completely breaking, since all methods of the old Transaction are available on both EIP2930Transaction and LegacyTransaction, so maybe it is not strictly breaking?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TransactionFactory in Block is not a problem. With a new Block release the Transaction dependency is also bumped, so the new Block version for sure works together with the new Transaction version release. So that is no problem.

Compatibility problems only arise on direct API usage, there, the only thing we have is the constructor:

constructor(
    header?: BlockHeader,
    transactions: Transaction[] = [],
    uncleHeaders: BlockHeader[] = [],
    opts: BlockOptions = {}
  ) 

This is the only part which we need to make sure that it still takes in Transaction objects from the current library version, which I think it should when we have a type EIP2930Transaction | Transaction (we should test this though, this would be an application for @cgewecke using the functionality from #1118.

If there is the possibility we should very much do this. This is so so crucial that we remain backwards-compatible (especially if it's just a case about a simple rename).

Copy link
Contributor

@cgewecke cgewecke Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rebase #1118 against this and open as draft.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! 😄 Can you do the suggested renaming (see above) also along for test purposes? So this would be needed to see if this will work.

Any opinion if we now might want to merge this in here? I would be still a fan, think this would ease follow-up development. Doesn't mean that we take this out of review. We can just do follow-up fixes on future PRs.


get yParity() {
return this.v
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, are these new aliases really worth to be introduced or do they not just complicate working with signatures since it confuses on generic usage if someone gets used to use e.g. yParity instead of v. I would have a tendency to just remove (or - not sure - add everything to BaseTransaction and at some point make v, r, s private further down the road and therefore have theses aliases just for all tx types).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this to be consistent with the EIP, it is basically an utility method.


if (this.v && !this.v.eqn(0) && !this.v.eqn(1)) {
throw new Error('The y-parity of the transaction should either be 0 or 1')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to move up this check should then also be in BaseTransaction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific check? LegacyTransaction can have any v value, it depends on the chain id? So if we would move this exact check to BaseTransaction, we break LegacyTransaction.

public isSigned(): boolean {
const { yParity, r, s } = this
return yParity !== undefined && !!r && !!s
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: remove here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can indeed put this one in BaseTransaction.


public getMessageToVerifySignature(): Buffer {
return this.getMessageToSign()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to have this name as a better working alias I think it's cleaner to:

  • replace getMessageToSign with getMessageToVerifySignature in both tx classes (so also the legacy tx), add the implementations there
  • Remove this call-in function here, add a reverse-one in the BaseTransaction (so name getMessageToSign and call into the getMessageToVerifySignature() functions
  • Mark this getMessageToSign function in BaseTransaction as deprecated, add a removal note to the v6 release notes

This should work, right?

try {
return ecrecover(
msgHash,
yParity.toNumber() + 27, // Recover the 27 which was stripped from ecsign
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused here, so is yParity an alias to v or does this have a difference of 27?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EIP states that yParity is either 0 or 1. ecsign internally adds 27 to the output of secp256k1. Thus when we get the signature's v value, we either get 27 or 28, to get the right v value we subtract 27. But in order to use ecrecover (which expects a number >=27) we have to "recover" the original output value from ecsign.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've read this again a bit also in EIP-155. So if I get correctly, both v (in legacy) and yParity (in EIP2930) are referrring to the v value (so are indeed aliases), it is just the case that different v values for the different transaction types are used.

If this is the case (correct me if I am wrong) I would pledge to:

  1. Move
public readonly v?: BN
public readonly r?: BN
public readonly s?: BN

to BaseTransaction.

  1. Drop at least the s and r aliases in EIP2930Transaction, so senderS() and senderR(). If people uses these this will only lead to the situation that they are making their code incompatible to work with a LegacyTransaction, which is unnecessary and rather annoying in doubt.

  2. Eventually also drop the yParity() alias, not sure. Might be somewhat nice to reflect the differences, but at the end same argumentation as in 2.

  3. Move as much generic signature functionality as possible into BaseTransaction

@@ -97,6 +98,23 @@ export default async function runTx(this: VM, opts: RunTxOpts): Promise<RunTxRes
debug('-'.repeat(100))
debug(`tx checkpoint`)

// Is it an Access List transaction?
if (opts.tx.transactionType === 1 && this._common.isActivatedEIP(2929)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this backwards things applies to the VM as well (then tx in RunTxOpts just gets expanded by also allowing the new 2930Transaction) and then we would just need some additional checks in places like here if the opts.tx.transactionType property exists before checking for the value.

@holgerd77
Copy link
Member

Guys, I would want to merge this in here now. I think this is stable enough to do selected further work upon, and this is otherwise blocking too much other work like the other mid-sized outstanding PR like the RPC one from @ryanio. I will also help on further discussion frame to have some blank slate again so that it gets easier again to pick selected aspects to discuss.

Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, further work to be done in subsequent PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants