From 2f7183fa43a8e11c75773ddabd45289cdcb7ab99 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Fri, 1 Nov 2019 14:17:37 -0600 Subject: [PATCH 01/10] Rough draft Signed-off-by: Danno Ferrin --- EIPS/eip-validatedEVMContracts.md | 180 ++++++++++++++++++++++++++++++ 1 file changed, 180 insertions(+) create mode 100644 EIPS/eip-validatedEVMContracts.md diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-validatedEVMContracts.md new file mode 100644 index 0000000000000..03f34854d96f9 --- /dev/null +++ b/EIPS/eip-validatedEVMContracts.md @@ -0,0 +1,180 @@ +--- +eip: +title: Validated EVM Contracts +author: Danno Ferrin (@shemnon), +discussions-to: +status: Draft +type: Standards Track +category: Core +created: 2019-11-01 +requires: 1702, 2327, +replaces: 1707 (unpublished), 1714 (unpublished), 615 (in part) +--- + +## Simple Summary + +Make minor changes to EVM contract layout and add validation rules to a subset of those contracts. + +## Abstract + +A set of contract markers and validation rules relating to those markers is proposed. These +validation rules enable forwards compatible evolution of EVM contracts and provide some assurances +to Ethereum clients allowing them to disable some runtime verification steps by moving these +validations to the deployment phase. + +## Motivation + +There are two major motivations: first the need to make the EVM easier to evolve, and the second is +to provide validations that allow clients to optimize their EVM execution. + +First there is the issue of an evolvable EVM. With the current state of EVM contracts literally any +sequence of bytes can be deployed to the blockchain. Some tools take advantage of this situation and +add meta-data to the end of their contract deployment. The real impact is that this precludes the +addition of new multi-byte instructions (such as the PUSHn series) because the new instructions +could hide a previously valid `JUMPDEST` when evaluated as a new opcode set. 
So invalid contracts +will not be deployable. + +Second there is the issue of improving runtime execution. One example is `JUMPDEST` evaluation. +Because each jump must "land" on a jump dest each client needs to validate that the dest is a valid +opcode location. Clients either need to do the analysis and store the values or re-evaluate the +contract on each execution. Stronger deployment validation cal allow clients to presume all jump +calls are valid. + +A tertiary motivation is to prepare the way for easily JITable contracts. While the current EVM can +be JIT compiled there are certain analyses that need to be performed to prevent some pathological or +uncompilable cases from being compiled, or to provide the necessary analysis. With stricter rules +these cases can be detected at deploy time and rejected, allowing EVM clients to make better +assumptions about the contract being compiled. + +## Specification + +There are four interlocking portions, in addition to EIP-1702 (Generalized Account Versioning +Scheme) and EIP-2327 (`BEGINDATA` opcode). They are a versioning header (similar to what was in +EIP-1707), invalid opcode validation, static jump analysis, and same evm limits validation. + +### EVM Account Versioning + +This EIP only applies to contracts stored or in the process of being stored in in accounts with +version `1`. This EIP never applies to contracts stored or in the process of being stored in +accounts at version `0`. Future EIPs may increase the set of contract versions this EIP applies to. + +### Version Header + +For contracts with the first byte from `1` to `255`, or whose total length is less than 4 bytes, the +contract is treated exactly as through it had been deployed to an account with version `0`. For +these contracts none of the other subsections in this EIP apply. 
+ +Whe deploying a contract if a contract starts with `0`, has a length 4 or later, and has a version +that is not recognized by the EVM the contract deployment transaction fails and consumes all +allocated gas. + +For purposes of PC calculations the first byte after the version header is `0`. There is no +mechanism within the EVM to retrieve the header values. + +### `BEGINDATA` + +As described in EIP-2327 a new opcode `BEGINDATA` (0xGG) is added that indicates the remainder of +the contract should not be considered executable code. + +### Invalid Opcode Validation + +All data between the Version Header and the `BEGINDATA` marker, or the end of the contract if +`BEGINDATA` is not present, must represent a valid EVM program at all points of the data. Invalid +opcode validation consists of the following process: + +- Iterate over the code bytes one by one. + - If the code byte is a multi-byte operation, skip the appropriate number of bytes. + - If the code byte is a valid opcode or designated invalid instruction (`0xfe`), continue. + - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating. + - Otherwise, throw out-of-gas. + +As of the writing of this spec all of the multibyte operations are the `PUSHn` series of operations +from `0x60` to `0x7f`. Future forks may add more multi-byte operations. + +### Static Jump Validations + +For every jump operation preceded by a `PUSHn` instruction the value of the data pushed on to the +stack by the `PUSHn` operation must point to a valid `JUMPDEST` operation. Clients may combine this +check with invalid opcode validation. + +As of the writing of this spec the jump operations are `JUMP` (0x56) and `JUMPI` (0x57). Future +forks may add more jump operations. + +### ?Sane Limits checks? (1985)? + +> TODO research if any EIP1985 checks make sense. + +## Rationale + +The first major feature is the invalid opcode removal. 
In the case where a contract has an invalid +opcode that later becomes a multibyte opcode followed by a `JUMPDEST` marker that contract would +become invalid after a hard fork because the destination marker would become part of the new +multibyte instruction . If no invalid opcodes can be created then the +possibility of the `JUMPDEST` being absorbed is eliminated. + +One complication is that current versions of solidity append the swarm hash of the source code of +the contract in some instances to the end of the generated EVM bytecode. That is what motivated the +addition of the `BEGINDATA` opcode. Solidity can add a fairly simple wrapper function to it's +existing EVM generation. + +`JUMPDEST` validation is present to eliminate repeated validation calls for contracts and to reduce +the needed data storage requirements for cached validation. For example, if a client notices a +contract contains only static jumps it could store a cached validation flag that no jump analysis +needs to be performed, alternately they could defer the analysis until the first dynamic jump is +encountered. + +## Backwards Compatibility + +Almost all existing contract deployments will be able to be deployed with no client changes. The one +exception is contract deployments that start with a zero byte. This should have no impact on +existing contract execution because any contract with a zero byte in the first position is not +executable. + +Except for the validation rules and versioning header all other semantics of the EVM are the same. +Gas schedules and opcode tables would be the same between versions and headers. + +## Forwards Compatibility + +This spec provides forward compatibility in at least two ways. + +First, the content of multi byte and jump dest validated opcodes can be increased in future forks. +Contracts that would be valid under new rules would be rejected under old rules, and all older +contracts would still be valid under the new rules. 
Any newly deployed opcodes would be disabled +unless the code is appropriately validated. + +Second, the versioning header can be extended to allow for stricter validations in future hard forks +while keeping the EVM evaluation semantics the same. Such possible stricter validations could +include prohibiting dynamic jumps. + +## Test Cases + +Incomplete whiteboard list + +- Positive + - no header and invalid opcodes + - including the case where a `JUMPDEST` gets consumed by a proposed multi-byte operation + - no header and all valid opcodes + - includes static jump to invalid destination + - header and all valid opcodes + - includes static jump to valid destination + - header, all valid opcodes, and `BEGINDATA` + - header, all valid opcodes, `BEGINDATA`, and invalid opcodes in data + - three byte program, starts with zero + - four bytes program, header only + - header and begin data only +- Negative + - contract with otherwise valid program that starts with zero, 5 bytes or more + - contract with header and invalid opcodes + - contract with header, begin data, and invalid opcodes in the middle + - contract with header, and static jump to bad place + - contract with header, and 1985 violations (one contract per violation) + - header, and contract code too large + - header, contract code, begin data, data, and the whole thing is too large + +## Implementation + +not done yet + +## Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). 
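Reviewer's sketch of the deploy-time checks drafted in the patch above, offered as an illustration rather than a reference implementation. It combines the Version Header rule, Invalid Opcode Validation, and Static Jump Validations in a single pass, which the draft permits. Several details are assumptions and are marked as such in the comments: the function and constant names are hypothetical, the only recognized header is taken to be `0x00 0x00 0x00 0x01` (the value defined later in this series), the unassigned-opcode ranges are the Istanbul list enumerated later in this series, and returning `False` stands in for the deployment failing with out-of-gas.

```python
# Illustrative only: names below are hypothetical, not part of the EIP text.
BEGINDATA = 0xB6          # marker opcode from EIP-2327
JUMPDEST = 0x5B
JUMP, JUMPI = 0x56, 0x57
PUSH1, PUSH32 = 0x60, 0x7F
HEADER_LEN = 4
# Assumption: only header version 1 is recognized (defined later in the series).
RECOGNIZED_HEADERS = {bytes([0x00, 0x00, 0x00, 0x01])}


def _ranges(*pairs):
    """Expand inclusive (low, high) pairs into a set of byte values."""
    out = set()
    for low, high in pairs:
        out.update(range(low, high + 1))
    return out


# Unassigned opcodes as of Istanbul, per the list given later in this series.
# 0xfe (designated invalid instruction) stays permitted; 0xb6 is BEGINDATA.
UNASSIGNED = _ranges(
    (0x0C, 0x0F), (0x1E, 0x1F), (0x21, 0x2F), (0x46, 0x4F), (0x5C, 0x5F),
    (0xA5, 0xAF), (0xB3, 0xEF), (0xF6, 0xF9), (0xFB, 0xFC),
) - {BEGINDATA}


def validate_contract(contract: bytes) -> bool:
    """Return True if deployment may proceed, False if it should fail."""
    # No leading zero byte, or shorter than a header: treated as version 0.
    if len(contract) < HEADER_LEN or contract[0] != 0x00:
        return True
    if contract[:HEADER_LEN] not in RECOGNIZED_HEADERS:
        return False

    code = contract[HEADER_LEN:]   # program counter 0 is the first byte here
    jumpdests = set()
    static_targets = []
    last_push = None               # value of the immediately preceding PUSHn
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == BEGINDATA:
            break                  # the remainder is data, not code
        if op in UNASSIGNED:
            return False
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1
            # Assumption: a PUSH cut off by the end of the contract is read
            # as-is; the draft does not say how to treat that case.
            last_push = int.from_bytes(code[pc + 1:pc + 1 + size], "big")
            pc += 1 + size         # skip the immediate data bytes
            continue
        if op == JUMPDEST:
            jumpdests.add(pc)
        elif op in (JUMP, JUMPI) and last_push is not None:
            static_targets.append(last_push)
        last_push = None
        pc += 1

    # Every static jump must land on a JUMPDEST located in the code section.
    return all(target in jumpdests for target in static_targets)
```

For example, `validate_contract(bytes([0, 0, 0, 1, 0x60, 0x03, 0x56, 0x5B]))` accepts a headered contract whose `PUSH1 0x03; JUMP` targets the `JUMPDEST` at location 3, while changing the pushed value to `0x02` makes the same contract fail. Collecting jump targets during the opcode scan and checking them afterwards keeps the check to one pass even when a jump refers to a `JUMPDEST` further along in the code.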
From cbefa573362752501fc6ad06f754184a636fb4e8 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Fri, 1 Nov 2019 14:47:50 -0600 Subject: [PATCH 02/10] solidity, tests, opcode Signed-off-by: Danno Ferrin --- EIPS/eip-validatedEVMContracts.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-validatedEVMContracts.md index 03f34854d96f9..eb6cd6129f661 100644 --- a/EIPS/eip-validatedEVMContracts.md +++ b/EIPS/eip-validatedEVMContracts.md @@ -68,12 +68,15 @@ Whe deploying a contract if a contract starts with `0`, has a length 4 or later, that is not recognized by the EVM the contract deployment transaction fails and consumes all allocated gas. +For this EIP, only header version '1' (contracts starting with the byte stream 0x00 0x00 0x00 0x01) +is defined. Future EIPs may expand on the valid set of headers. + For purposes of PC calculations the first byte after the version header is `0`. There is no mechanism within the EVM to retrieve the header values. ### `BEGINDATA` -As described in EIP-2327 a new opcode `BEGINDATA` (0xGG) is added that indicates the remainder of +As described in EIP-2327 a new opcode `BEGINDATA` (0xb6) is added that indicates the remainder of the contract should not be considered executable code. ### Invalid Opcode Validation @@ -133,6 +136,10 @@ executable. Except for the validation rules and versioning header all other semantics of the EVM are the same. Gas schedules and opcode tables would be the same between versions and headers. +Existing compilers (such as solidity) can provide support for headers by prepending their output +stream with 0x00, 0x00, 0x00, 0x01 and appending in 0xb6 prior to any non-code data added as part of +the contract. + ## Forwards Compatibility This spec provides forward compatibility in at least two ways. 
@@ -168,6 +175,9 @@ Incomplete whiteboard list - contract with header, begin data, and invalid opcodes in the middle - contract with header, and static jump to bad place - contract with header, and 1985 violations (one contract per violation) + - contract with unrecognized header + - contract with a static jump into code in BEGINDATA + - contract with a static jump outside of all data - header, and contract code too large - header, contract code, begin data, data, and the whole thing is too large From 92229a7a8f2b47d1e6063929a050faec81bb2d03 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Mon, 4 Nov 2019 07:03:32 -0700 Subject: [PATCH 03/10] Updates * remove 1985 references, I don't see it applying * enumerate invalid opcodes * embedded links to othe EIPs * style on backticking numbers and opcodes Signed-off-by: Danno Ferrin --- EIPS/eip-validatedEVMContracts.md | 90 ++++++++++++++++++------------- 1 file changed, 53 insertions(+), 37 deletions(-) diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-validatedEVMContracts.md index eb6cd6129f661..16a22e7a16799 100644 --- a/EIPS/eip-validatedEVMContracts.md +++ b/EIPS/eip-validatedEVMContracts.md @@ -1,14 +1,14 @@ --- eip: title: Validated EVM Contracts -author: Danno Ferrin (@shemnon), +author: Danno Ferrin (@shemnon) discussions-to: status: Draft type: Standards Track category: Core -created: 2019-11-01 -requires: 1702, 2327, -replaces: 1707 (unpublished), 1714 (unpublished), 615 (in part) +created: 2019-11-01 +requires: 1702, 2327 +replaces: 615 (in part), 1707 (unpublished, abandoned), 1712 (unpublished) --- ## Simple Summary @@ -30,7 +30,7 @@ to provide validations that allow clients to optimize their EVM execution. First there is the issue of an evolvable EVM. With the current state of EVM contracts literally any sequence of bytes can be deployed to the blockchain. Some tools take advantage of this situation and add meta-data to the end of their contract deployment. 
The real impact is that this precludes the -addition of new multi-byte instructions (such as the PUSHn series) because the new instructions +addition of new multi-byte instructions (such as the `PUSHn` series) because the new instructions could hide a previously valid `JUMPDEST` when evaluated as a new opcode set. So invalid contracts will not be deployable. @@ -48,9 +48,10 @@ assumptions about the contract being compiled. ## Specification -There are four interlocking portions, in addition to EIP-1702 (Generalized Account Versioning -Scheme) and EIP-2327 (`BEGINDATA` opcode). They are a versioning header (similar to what was in -EIP-1707), invalid opcode validation, static jump analysis, and same evm limits validation. +There are three interlocking portions specified in this EIP, in addition to two other portions from +active EIPs: [EIP-1702] (Generalized Account Versioning Scheme) and [EIP-2327] (`BEGINDATA` opcode). +They are a versioning header (similar to what was in [EIP-1707]), invalid opcode validation (similar +to [EIP-1712]), and static jump analysis. ### EVM Account Versioning @@ -68,16 +69,16 @@ Whe deploying a contract if a contract starts with `0`, has a length 4 or later, that is not recognized by the EVM the contract deployment transaction fails and consumes all allocated gas. -For this EIP, only header version '1' (contracts starting with the byte stream 0x00 0x00 0x00 0x01) -is defined. Future EIPs may expand on the valid set of headers. +For this EIP, only header version `1` (contracts starting with the byte stream `0x00` `0x00` `0x00` +`0x01`) is defined. Future EIPs may expand on the valid set of headers. -For purposes of PC calculations the first byte after the version header is `0`. There is no -mechanism within the EVM to retrieve the header values. +For purposes of Program Counter calculations the first byte after the version header is location +`0`. There is no mechanism within the EVM to retrieve the header values. 
### `BEGINDATA` -As described in EIP-2327 a new opcode `BEGINDATA` (0xb6) is added that indicates the remainder of -the contract should not be considered executable code. +As described in [EIP-2327] a new opcode `BEGINDATA` (`0xb6`) is added that indicates the remainder +of the contract should not be considered executable code. ### Invalid Opcode Validation @@ -91,8 +92,14 @@ opcode validation consists of the following process: - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating. - Otherwise, throw out-of-gas. -As of the writing of this spec all of the multibyte operations are the `PUSHn` series of operations -from `0x60` to `0x7f`. Future forks may add more multi-byte operations. +As of the Istanbul upgrade all of the multi-byte operations are the `PUSHn` series of operations +from `0x60` to `0x7f`. Future upgrades may add more multi-byte operations. + +As of the Istanbul upgrade the invalid opcodes are `0x0c` to `0x0f`, `0x1e`, `0x1f`, `0x21` to +`0x2f`, `0x46` to `0x4f`, `0x5c` to `0x5f`, `0xa5` to `0xaf`, `0xb3` to `0xef`, `0xf6` to `0xf9`, +`0xfb`, `0xfc`, and `0xfe`. Future upgrades will remove items from this list. Note that `0xb6` is +referenced in this spec as the `BEGINDATA` marker, but is not part of any deployed upgrade. Also +note that `0xfe` would remain as a reserved 'invalid instruction' that will still be permitted. ### Static Jump Validations @@ -100,20 +107,16 @@ For every jump operation preceded by a `PUSHn` instruction the value of the data stack by the `PUSHn` operation must point to a valid `JUMPDEST` operation. Clients may combine this check with invalid opcode validation. -As of the writing of this spec the jump operations are `JUMP` (0x56) and `JUMPI` (0x57). Future -forks may add more jump operations. - -### ?Sane Limits checks? (1985)? - -> TODO research if any EIP1985 checks make sense. +As of the Istanbul upgrade the jump operations are `JUMP` (`0x56`) and `JUMPI` (`0x57`). 
Future +upgrades may add more jump operations. ## Rationale The first major feature is the invalid opcode removal. In the case where a contract has an invalid -opcode that later becomes a multibyte opcode followed by a `JUMPDEST` marker that contract would -become invalid after a hard fork because the destination marker would become part of the new -multibyte instruction . If no invalid opcodes can be created then the -possibility of the `JUMPDEST` being absorbed is eliminated. +opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract would +become invalid after an upgrade because the destination marker would become part of the new +multi-byte instruction, as described in the [EIP-663 discussion]. If no invalid opcodes can be +created then the possibility of the `JUMPDEST` being absorbed is eliminated. One complication is that current versions of solidity append the swarm hash of the source code of the contract in some instances to the end of the generated EVM bytecode. That is what motivated the @@ -130,26 +133,28 @@ encountered. Almost all existing contract deployments will be able to be deployed with no client changes. The one exception is contract deployments that start with a zero byte. This should have no impact on -existing contract execution because any contract with a zero byte in the first position is not -executable. +existing contract execution because any contract with a zero byte in the first position would +immediately halt because `0x00` maps to the `STOP` instruction, the utility and value of those +contracts is minimal at best. If this is not desirable a different header signaling byte (such as +`0xEF`) can be used. Except for the validation rules and versioning header all other semantics of the EVM are the same. Gas schedules and opcode tables would be the same between versions and headers. 
Existing compilers (such as solidity) can provide support for headers by prepending their output -stream with 0x00, 0x00, 0x00, 0x01 and appending in 0xb6 prior to any non-code data added as part of -the contract. +stream with `0x00`, `0x00`, `0x00`, `0x01` and appending in `0xb6` prior to any non-code data added +as part of the contract. ## Forwards Compatibility This spec provides forward compatibility in at least two ways. -First, the content of multi byte and jump dest validated opcodes can be increased in future forks. -Contracts that would be valid under new rules would be rejected under old rules, and all older -contracts would still be valid under the new rules. Any newly deployed opcodes would be disabled -unless the code is appropriately validated. +First, the content of multi byte and jump dest validated opcodes can be increased in future +upgrades. Contracts that would be valid under new rules would be rejected under old rules, and all +older contracts would still be valid under the new rules. Any newly deployed opcodes would be +disabled unless the code is appropriately validated. -Second, the versioning header can be extended to allow for stricter validations in future hard forks +Second, the versioning header can be extended to allow for stricter validations in future upgrades while keeping the EVM evaluation semantics the same. Such possible stricter validations could include prohibiting dynamic jumps. @@ -157,6 +162,9 @@ include prohibiting dynamic jumps. Incomplete whiteboard list +Each test would need to be written 3 times, once for normal contract deployment, once for `CREATE`, +and once again for `CREATE2`. 
+ - Positive - no header and invalid opcodes - including the case where a `JUMPDEST` gets consumed by a proposed multi-byte operation @@ -176,15 +184,23 @@ Incomplete whiteboard list - contract with header, and static jump to bad place - contract with header, and 1985 violations (one contract per violation) - contract with unrecognized header - - contract with a static jump into code in BEGINDATA + - contract with a static jump into code in `BEGINDATA` - contract with a static jump outside of all data - header, and contract code too large - header, contract code, begin data, data, and the whole thing is too large + - One test for each invalid opcode. Plain, with header, and with header and `BEGINDATA` ## Implementation -not done yet +No implementation yet. ## Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). + +[eip-615]: https://eips.ethereum.org/EIPS/eip-615 +[eip-1702]: https://eips.ethereum.org/EIPS/eip-1702 +[eip-1707]: https://github.com/ethereum/EIPs/pull/1707 +[eip-1712]: https://github.com/ethereum/EIPs/pull/1712 +[eip-2327]: https://github.com/ethereum/EIPs/pull/2327 +[eip-663 discussion]: https://ethereum-magicians.org/t/eip-663-unlimited-swap-and-dup-instructions/3346/11?u=shemnon From 3cf50f0f328a56deafc9e338e2bca53b7e43b404 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Mon, 4 Nov 2019 08:57:21 -0700 Subject: [PATCH 04/10] Editing pass Signed-off-by: Danno Ferrin --- EIPS/eip-validatedEVMContracts.md | 106 +++++++++++++++++------------- 1 file changed, 62 insertions(+), 44 deletions(-) diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-validatedEVMContracts.md index 16a22e7a16799..2c85602304c83 100644 --- a/EIPS/eip-validatedEVMContracts.md +++ b/EIPS/eip-validatedEVMContracts.md @@ -31,49 +31,59 @@ First there is the issue of an evolvable EVM. With the current state of EVM cont sequence of bytes can be deployed to the blockchain. 
Some tools take advantage of this situation and add meta-data to the end of their contract deployment. The real impact is that this precludes the addition of new multi-byte instructions (such as the `PUSHn` series) because the new instructions -could hide a previously valid `JUMPDEST` when evaluated as a new opcode set. So invalid contracts -will not be deployable. +could hide a previously valid `JUMPDEST` when evaluated as a new opcode set. To prevent this account +versioning will be used so that contracts can be deployed in a way that is demonstrably validated. Second there is the issue of improving runtime execution. One example is `JUMPDEST` evaluation. Because each jump must "land" on a jump dest each client needs to validate that the dest is a valid opcode location. Clients either need to do the analysis and store the values or re-evaluate the -contract on each execution. Stronger deployment validation cal allow clients to presume all jump -calls are valid. +contract on each execution. Stronger deployment validation will allow clients to presume jump calls +are valid in certain circumstances. A tertiary motivation is to prepare the way for easily JITable contracts. While the current EVM can -be JIT compiled there are certain analyses that need to be performed to prevent some pathological or -uncompilable cases from being compiled, or to provide the necessary analysis. With stricter rules -these cases can be detected at deploy time and rejected, allowing EVM clients to make better -assumptions about the contract being compiled. +be JIT compiled there are certain analyses that need to be performed to prevent or accommodate some +pathological or uncompilable cases from being compiled. With stricter rules these cases can be +detected at deploy time and rejected allowing EVM clients to make better assumptions about the +contract being compiled. 
## Specification -There are three interlocking portions specified in this EIP, in addition to two other portions from -active EIPs: [EIP-1702] (Generalized Account Versioning Scheme) and [EIP-2327] (`BEGINDATA` opcode). -They are a versioning header (similar to what was in [EIP-1707]), invalid opcode validation (similar -to [EIP-1712]), and static jump analysis. +There are three interlocking portions specified in this EIP and two portions from other active EIPs +included in this validation. [EIP-1702] (Generalized Account Versioning Scheme) and [EIP-2327] +(`BEGINDATA` opcode) are specified in their published locations. The portions specified in this EIP +are a versioning header (similar to what was in [EIP-1707]), invalid opcode validation (similar to +[EIP-1712]), and static jump analysis. ### EVM Account Versioning -This EIP only applies to contracts stored or in the process of being stored in in accounts with -version `1`. This EIP never applies to contracts stored or in the process of being stored in +Starting at `BLOCKNUM` (TBD) `EIP-1702` will be activated, `LATEST_VERSION` will be set to `1`, and +all new and updated accounts will have the account version `1`. The validation phase will apply the +rules described in [Version Header](version-header), [`BEGINDATA`](beindata), +[Invalid Opcode Validation](invalid-opcode-validation), and +[Static Jump Validations](static-jump-validations). + +These EIP sections only apply to contracts stored or in the process of being stored in in accounts +with version `1`. This EIP never applies to contracts stored or in the process of being stored in accounts at version `0`. Future EIPs may increase the set of contract versions this EIP applies to. ### Version Header -For contracts with the first byte from `1` to `255`, or whose total length is less than 4 bytes, the -contract is treated exactly as through it had been deployed to an account with version `0`. 
For +For contracts with the first byte from `0x01` to `0xff`, or whose total length is less than 4 bytes, +the contract is treated exactly as through it had been deployed to an account with version `0`. For these contracts none of the other subsections in this EIP apply. -Whe deploying a contract if a contract starts with `0`, has a length 4 or later, and has a version -that is not recognized by the EVM the contract deployment transaction fails and consumes all -allocated gas. +Whe deploying a contract if a contract starts with `0x00` and has a length 4 or later the first four +bytes form a version header. If a version header is not recognized by the EVM the contract +deployment transaction fails with out-of-gas. For this EIP, only header version `1` (contracts starting with the byte stream `0x00` `0x00` `0x00` -`0x01`) is defined. Future EIPs may expand on the valid set of headers. +`0x01`) is defined. Future EIPs may expand on the valid set of headers. This version indicates that +next three validations are applied to the content of the contract, keeping all other semantics of +the current "version 0" EVM contracts, including the same gas schedule. -For purposes of Program Counter calculations the first byte after the version header is location -`0`. There is no mechanism within the EVM to retrieve the header values. +For purposes of EVM Program Counter calculations the first byte after the version header is location +`0`. The contract header is not part of the accessible contract data. There is no mechanism within +the EVM to retrieve the header values. ### `BEGINDATA` @@ -82,8 +92,8 @@ of the contract should not be considered executable code. ### Invalid Opcode Validation -All data between the Version Header and the `BEGINDATA` marker, or the end of the contract if -`BEGINDATA` is not present, must represent a valid EVM program at all points of the data. 
Invalid +All data between the Version Header and either the `BEGINDATA` marker or the end of the contract if +`BEGINDATA` is not present must represent a valid EVM program at all points of the data. Invalid opcode validation consists of the following process: - Iterate over the code bytes one by one. @@ -104,24 +114,29 @@ note that `0xfe` would remain as a reserved 'invalid instruction' that will stil ### Static Jump Validations For every jump operation preceded by a `PUSHn` instruction the value of the data pushed on to the -stack by the `PUSHn` operation must point to a valid `JUMPDEST` operation. Clients may combine this -check with invalid opcode validation. +stack by the `PUSHn` operation must point to a valid `JUMPDEST` operation. If this validation fails +then the contract creation fails with out-of-gas. As of the Istanbul upgrade the jump operations are `JUMP` (`0x56`) and `JUMPI` (`0x57`). Future upgrades may add more jump operations. +As a client optimization this check may be performed during invalid opcode validation, or it may be +performed separately at contract deployment time. + ## Rationale The first major feature is the invalid opcode removal. In the case where a contract has an invalid opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract would become invalid after an upgrade because the destination marker would become part of the new multi-byte instruction, as described in the [EIP-663 discussion]. If no invalid opcodes can be -created then the possibility of the `JUMPDEST` being absorbed is eliminated. +deployed then the possibility of the `JUMPDEST` being absorbed by new multi-byte instructions is +eliminated. One complication is that current versions of solidity append the swarm hash of the source code of the contract in some instances to the end of the generated EVM bytecode. That is what motivated the addition of the `BEGINDATA` opcode. 
Solidity can add a fairly simple wrapper function to it's -existing EVM generation. +existing EVM generation. This option was chosen for its simplicity over other options such as +encoding the data in uncalled `PUSNn` instructions. `JUMPDEST` validation is present to eliminate repeated validation calls for contracts and to reduce the needed data storage requirements for cached validation. For example, if a client notices a @@ -132,18 +147,21 @@ encountered. ## Backwards Compatibility Almost all existing contract deployments will be able to be deployed with no client changes. The one -exception is contract deployments that start with a zero byte. This should have no impact on -existing contract execution because any contract with a zero byte in the first position would -immediately halt because `0x00` maps to the `STOP` instruction, the utility and value of those -contracts is minimal at best. If this is not desirable a different header signaling byte (such as -`0xEF`) can be used. +exception is contract deployments that start with `0x00`. This should have no impact on existing +contract execution because any contract with a `0x00` in the first position would immediately halt +because `0x00` maps to the `STOP` instruction, the utility and value of those contracts is minimal +at best. If this is not desirable a different header signaling byte that does not map to an existing +opcode (such as `0xEF`) can be used. Except for the validation rules and versioning header all other semantics of the EVM are the same. -Gas schedules and opcode tables would be the same between versions and headers. +Gas schedules and opcode tables would be the same between account versions and wether or not the +contract was deployed with headers. Future EIPs may add opcodes that are only valid with a contract +that is deployed with a version header. Because of the version header validation rules multi-byte +contracts can be deployed. 
Existing compilers (such as solidity) can provide support for headers by prepending their output -stream with `0x00`, `0x00`, `0x00`, `0x01` and appending in `0xb6` prior to any non-code data added -as part of the contract. +stream with `0x00`, `0x00`, `0x00`, `0x01` and appending `0xb6` prior to any non-code data added as +part of the contract. ## Forwards Compatibility @@ -160,10 +178,9 @@ include prohibiting dynamic jumps. ## Test Cases -Incomplete whiteboard list - -Each test would need to be written 3 times, once for normal contract deployment, once for `CREATE`, -and once again for `CREATE2`. +This is an incomplete list, but provides insight as to the scope of the required testing. Each test +would need to be written 3 times, once for normal contract deployment, once for `CREATE`, and once +again for `CREATE2`. - Positive - no header and invalid opcodes @@ -182,13 +199,13 @@ and once again for `CREATE2`. - contract with header and invalid opcodes - contract with header, begin data, and invalid opcodes in the middle - contract with header, and static jump to bad place - - contract with header, and 1985 violations (one contract per violation) - contract with unrecognized header - contract with a static jump into code in `BEGINDATA` - contract with a static jump outside of all data - - header, and contract code too large + - header, and contract code+header to large by less than 4 bytes + - header, and contract code+header to large by more than 4 bytes - header, contract code, begin data, data, and the whole thing is too large - - One test for each invalid opcode. 
Plain, with header, and with header and `BEGINDATA` + - One test for each invalid opcode: no header, with header, and with header and `BEGINDATA` ## Implementation @@ -203,4 +220,5 @@ Copyright and related rights waived via [CC0](https://creativecommons.org/public [eip-1707]: https://github.com/ethereum/EIPs/pull/1707 [eip-1712]: https://github.com/ethereum/EIPs/pull/1712 [eip-2327]: https://github.com/ethereum/EIPs/pull/2327 -[eip-663 discussion]: https://ethereum-magicians.org/t/eip-663-unlimited-swap-and-dup-instructions/3346/11?u=shemnon +[eip-663 discussion]: + https://ethereum-magicians.org/t/eip-663-unlimited-swap-and-dup-instructions/3346/11?u=shemnon From 9897c8975445b9230e8a950f64d53d761b956d09 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Mon, 4 Nov 2019 09:07:46 -0700 Subject: [PATCH 05/10] Add EIP# and discussion link Signed-off-by: Danno Ferrin --- EIPS/eip-validatedEVMContracts.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-validatedEVMContracts.md index 2c85602304c83..da1e054d4f9da 100644 --- a/EIPS/eip-validatedEVMContracts.md +++ b/EIPS/eip-validatedEVMContracts.md @@ -1,8 +1,8 @@ --- -eip: +eip: 2348 title: Validated EVM Contracts author: Danno Ferrin (@shemnon) -discussions-to: +discussions-to: https://ethereum-magicians.org/t/eip-2348-validated-evm-contracts/3756 status: Draft type: Standards Track category: Core From fb6031d495df69bec4e8f8b37589eaa34b915922 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Mon, 4 Nov 2019 09:46:45 -0700 Subject: [PATCH 06/10] move file Signed-off-by: Danno Ferrin --- EIPS/{eip-validatedEVMContracts.md => eip-2348.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename EIPS/{eip-validatedEVMContracts.md => eip-2348.md} (100%) diff --git a/EIPS/eip-validatedEVMContracts.md b/EIPS/eip-2348.md similarity index 100% rename from EIPS/eip-validatedEVMContracts.md rename to EIPS/eip-2348.md From 2594fc32b660112d99606423544134133c89dce4 
Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Mon, 4 Nov 2019 10:09:08 -0700 Subject: [PATCH 07/10] Fix travis errors Signed-off-by: Danno Ferrin --- EIPS/eip-2348.md | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/EIPS/eip-2348.md b/EIPS/eip-2348.md index da1e054d4f9da..37f8b328391fb 100644 --- a/EIPS/eip-2348.md +++ b/EIPS/eip-2348.md @@ -7,9 +7,14 @@ status: Draft type: Standards Track category: Core created: 2019-11-01 +requires: 1702 +--- + + ## Simple Summary @@ -58,9 +63,8 @@ are a versioning header (similar to what was in [EIP-1707]), invalid opcode vali Starting at `BLOCKNUM` (TBD) `EIP-1702` will be activated, `LATEST_VERSION` will be set to `1`, and all new and updated accounts will have the account version `1`. The validation phase will apply the -rules described in [Version Header](version-header), [`BEGINDATA`](beindata), -[Invalid Opcode Validation](invalid-opcode-validation), and -[Static Jump Validations](static-jump-validations). +rules described in the Version Header, `BEGINDATA`, Invalid Opcode Validation, and Static Jump +Validations sections. These EIP sections only apply to contracts stored or in the process of being stored in in accounts with version `1`. This EIP never applies to contracts stored or in the process of being stored in @@ -72,8 +76,8 @@ For contracts with the first byte from `0x01` to `0xff`, or whose total length i the contract is treated exactly as through it had been deployed to an account with version `0`. For these contracts none of the other subsections in this EIP apply. -Whe deploying a contract if a contract starts with `0x00` and has a length 4 or later the first four -bytes form a version header. If a version header is not recognized by the EVM the contract +When deploying a contract if a contract starts with `0x00` and has a length 4 or later the first +four bytes form a version header. 
If a version header is not recognized by the EVM the contract deployment transaction fails with out-of-gas. For this EIP, only header version `1` (contracts starting with the byte stream `0x00` `0x00` `0x00` @@ -154,7 +158,7 @@ at best. If this is not desirable a different header signaling byte that does no opcode (such as `0xEF`) can be used. Except for the validation rules and versioning header all other semantics of the EVM are the same. -Gas schedules and opcode tables would be the same between account versions and wether or not the +Gas schedules and opcode tables would be the same between account versions and whether or not the contract was deployed with headers. Future EIPs may add opcodes that are only valid with a contract that is deployed with a version header. Because of the version header validation rules multi-byte contracts can be deployed. From 4b2aecf534432f4675acd6cfee408c8072fb5393 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Tue, 5 Nov 2019 08:44:01 -0700 Subject: [PATCH 08/10] update requires Signed-off-by: Danno Ferrin --- EIPS/eip-2348.md | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/EIPS/eip-2348.md b/EIPS/eip-2348.md index 37f8b328391fb..991c7b27a04c3 100644 --- a/EIPS/eip-2348.md +++ b/EIPS/eip-2348.md @@ -7,14 +7,8 @@ status: Draft type: Standards Track category: Core created: 2019-11-01 -requires: 1702 ---- - - +--- ## Simple Summary From d8451c68a17f63b3dfdc8d7164e0495cb2db9516 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Tue, 5 Nov 2019 08:49:17 -0700 Subject: [PATCH 09/10] nodes about PC and header bytes Signed-off-by: Danno Ferrin --- EIPS/eip-2348.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/EIPS/eip-2348.md b/EIPS/eip-2348.md index 991c7b27a04c3..9b1231ac50b88 100644 --- a/EIPS/eip-2348.md +++ b/EIPS/eip-2348.md @@ -79,9 +79,15 @@ For this EIP, only header version `1` (contracts starting with the byte stream ` next three validations are applied to the content 
of the contract, keeping all other semantics of the current "version 0" EVM contracts, including the same gas schedule. -For purposes of EVM Program Counter calculations the first byte after the version header is location -`0`. The contract header is not part of the accessible contract data. There is no mechanism within -the EVM to retrieve the header values. +**unresolved** - How do we deal with executing with the header? + +- Should contract execution start at index 4 as PC=0, + - This causes EXTCODECOPY indexes to not match up +- should contract execution start at index 4 as PC=4, + - This would require some possibly non-trivial EVM changes +- should the version header be a multi-byte instruction which is a no-op? Contract starts at 0 and + PC=0 + - This introduces a new opcode, may be the simplest. ### `BEGINDATA` From b96f47a2f16c9fd92f126eabc25e26a4d3ffa5f9 Mon Sep 17 00:00:00 2001 From: Danno Ferrin Date: Tue, 21 Jan 2020 09:26:34 -0700 Subject: [PATCH 10/10] Update spec - Add contract code size limit for code segment - Settle on `0xefevm` as header bytes Signed-off-by: Danno Ferrin --- EIPS/eip-2348.md | 103 ++++++++++++++++++++++++++++++++++------------- 1 file changed, 74 insertions(+), 29 deletions(-) diff --git a/EIPS/eip-2348.md b/EIPS/eip-2348.md index 9b1231ac50b88..8837c763055db 100644 --- a/EIPS/eip-2348.md +++ b/EIPS/eip-2348.md @@ -60,51 +60,82 @@ all new and updated accounts will have the account version `1`. The validation p rules described in the Version Header, `BEGINDATA`, Invalid Opcode Validation, and Static Jump Validations sections. -These EIP sections only apply to contracts stored or in the process of being stored in in accounts -with version `1`. This EIP never applies to contracts stored or in the process of being stored in -accounts at version `0`. Future EIPs may increase the set of contract versions this EIP applies to.
+These EIP sections apply to contracts stored or in the process of being stored in accounts with +version `1`. This EIP never applies to contracts stored or in the process of being stored in +accounts at version `0`. For initcode being executed for `CREATE` and `CREATE2` operations this +applies if the contract invoking the opcode is version `1`. If the calling contract was stored in an +account with version `0` this EIP does not apply. + +Future EIPs may increase the set of contract versions this EIP applies to. ### Version Header -For contracts with the first byte from `0x01` to `0xff`, or whose total length is less than 4 bytes, -the contract is treated exactly as through it had been deployed to an account with version `0`. For +For contracts whose first byte is not `0xef`, or whose total length is less than 4 bytes, the +contract is treated exactly as though it had been deployed to an account with version `0`. For these contracts none of the other subsections in this EIP apply. -When deploying a contract if a contract starts with `0x00` and has a length 4 or later the first +When deploying a contract, if the contract starts with `0xef` and has a length of 4 or more, the first four bytes form a version header. If a version header is not recognized by the EVM the contract deployment transaction fails with out-of-gas. -For this EIP, only header version `1` (contracts starting with the byte stream `0x00` `0x00` `0x00` -`0x01`) is defined. Future EIPs may expand on the valid set of headers. This version indicates that -next three validations are applied to the content of the contract, keeping all other semantics of -the current "version 0" EVM contracts, including the same gas schedule. +When executing a contract with a header the execution should start at `PC=4`, corresponding to the +first byte of the contract that is not part of the header.
+ +EVM implementations could model this as a 4-byte no-op, no-gas operation that can only occur at the +zeroth index of a contract. However, they would need to take care that the byte `0xef` would be +invalid if it occurred in the code segment at any location other than the zeroth byte. -**unresolved** - How do we deal with executing with the header? +For this EIP the header byte sequence [`0xef`, `0x65`, `0x76`, `0x6d`] is defined (corresponding to +the ISO/IEC 8859 part 1 string `'ïevm'`). This version indicates that the next set of +validations are applied to the content of the contract, keeping all other semantics of the current +"version 0" EVM contracts, including the same gas schedule. -- Should contract execution start at index 4 as PC=0, - - This causes EXTCODECOPY indexes to not match up -- should contract execution start at index 4 as PC=4, - - This would require some possibly non-trivial EVM changes -- should the version header be a multi-byte instruction which is a no-op? Contract starts at 0 and - PC=0 - - This introduces a new opcode, may be the simplest. +Future EIPs may expand on the valid set of headers. No other header sequences are defined in this +EIP. -### `BEGINDATA` +### `BEGINDATA` operation As described in [EIP-2327] a new opcode `BEGINDATA` (`0xb6`) is added that indicates the remainder of the contract should not be considered executable code. +If the EVM attempts to execute the `BEGINDATA` operation it should be treated as attempting to +execute an invalid operation. Similarly, jumping into any location after the `BEGINDATA` operation is +an invalid operation, even if the byte jumped to corresponds to the `JUMPDEST` opcode. + +### Code Segment Size Limit + +With the introduction of the `BEGINDATA` opcode the contract can now be conceptually split into a +code segment and a data segment.
The code segment corresponds to all the bytes prior to and +including the `BEGINDATA` opcode or the entire contract if no `BEGINDATA` opcode is present. All +other data after the code segment is referred to as the data segment. If there is no `BEGINDATA` +operation there are no bytes in the data segment. + +In [EIP 170](https://eips.ethereum.org/EIPS/eip-170) a contract code size limit was introduced. The +size of the code segment, including the header bytes and the `BEGINDATA` operation (if present), must be +less than or equal to the chain's specified contract code size limit, which is currently 24KiB for +mainnet. + +For contract creation transactions and the return of `CREATE` and `CREATE2` operations, this limit +is already enforced for the entire size of the contract, including code and data segments. For the +initialization code for a `CREATE` or `CREATE2` operation there is no specified limit, so the +code segment length limit will need to be enforced separately in those instances. The +combined code and data segment size for init code in `CREATE` and `CREATE2` operations is out of +scope for this EIP. + ### Invalid Opcode Validation All data between the Version Header and either the `BEGINDATA` marker or the end of the contract if `BEGINDATA` is not present must represent a valid EVM program at all points of the data. Invalid opcode validation consists of the following process: -- Iterate over the code bytes one by one. - - If the code byte is a multi-byte operation, skip the appropriate number of bytes. - - If the code byte is a valid opcode or designated invalid instruction (`0xfe`), continue. - - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating. - - Otherwise, throw out-of-gas. +- Iterate over the code bytes starting after the header bytes one by one. + - If the code byte is a multi-byte operation, skip the appropriate number of bytes and continue.
+ - If the code byte is a valid opcode or the designated invalid instruction (`0xfe`), continue. + - If the code byte is the `BEGINDATA` operation (`0xb6`) stop iterating and consider the contract + valid. + - If more bytes than the contract code size limit would be validated the contract is invalid and + the operation fails. + - Otherwise, the contract is invalid and the operation fails. As of the Istanbul upgrade all of the multi-byte operations are the `PUSHn` series of operations from `0x60` to `0x7f`. Future upgrades may add more multi-byte operations. @@ -129,9 +160,17 @@ performed separately at contract deployment time. ## Rationale -The first major feature is the invalid opcode removal. In the case where a contract has an invalid -opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract would -become invalid after an upgrade because the destination marker would become part of the new +The choice of `0xef` as the first byte of the header was first recommended in +[issue 154](https://github.com/ethereum/EIPs/issues/154) of the EIP repository. It also maps to an +unused opcode in the version 0 spec and packs next to the `0xf0` series of call instructions, and +the `evm` part was to mirror what WASM has done. `0x00` was not chosen as the first byte because it could be +confused with a nonsensical but correct contract that starts with `STOP` and whose next operation is +`PUSH6` if lowercase e was selected, or `STOP` `GASLIMIT` `JUMP` `` if capital letters +were used. A header that was always invalid in the prior EVM specs was seen as desirable. + +The first major validation is the invalid opcode removal. In the case where a contract has an +invalid opcode that later becomes a multi-byte opcode followed by a `JUMPDEST` marker that contract +would become invalid after an upgrade because the destination marker would become part of the new
If no invalid opcodes can be deployed then the possibility of the `JUMPDEST` being absorbed by new multi-byte instructions is eliminated. @@ -164,7 +203,7 @@ that is deployed with a version header. Because of the version header validation contracts can be deployed. Existing compilers (such as solidity) can provide support for headers by prepending their output -stream with `0x00`, `0x00`, `0x00`, `0x01` and appending `0xb6` prior to any non-code data added as +stream with `0xef`, `0x65`, `0x76`, `0x6d` and appending `0xb6` prior to any non-code data added as part of the contract. ## Forwards Compatibility @@ -198,6 +237,8 @@ again for `CREATE2`. - three byte program, starts with zero - four bytes program, header only - header and begin data only + - validated code in `CREATE` an `CREATE2` init code with proper code segment size and total size + greater than the code segment limit - Negative - contract with otherwise valid program that starts with zero, 5 bytes or more - contract with header and invalid opcodes @@ -209,7 +250,11 @@ again for `CREATE2`. - header, and contract code+header to large by less than 4 bytes - header, and contract code+header to large by more than 4 bytes - header, contract code, begin data, data, and the whole thing is too large - - One test for each invalid opcode: no header, with header, and with header and `BEGINDATA` + - one test for each invalid opcode: no header, with header, and with header and `BEGINDATA` + - code segment size violations + - In a contract creation transaction + - In `CREATE` and `CREATE2` init code + - In `CREATE` and `CREATE2` created contracts ## Implementation