Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum support #770

Merged
merged 25 commits into from
Sep 12, 2022
Merged

Enum support #770

merged 25 commits into from
Sep 12, 2022

Conversation

Y-Nak
Copy link
Member

@Y-Nak Y-Nak commented Jul 16, 2022

Add enum and match support

Overview

Enum

enum types can be defined using the enum keyword.
Variants can contain any number of types inside them with tuple like syntax.

pub enum MyEnum {
    Unit
    // Can Hold data inside it.
    Tuple(u32, u256, bool)
}

Constructors for each variant are automatically implemented, and constructors look like below.

let unit: MyEnum = MyEnum::Unit
let tup: MyEnum = MyEnum::Tuple(1, 2, false)

Match

In fe, match is a statement, not an expression. But this can be changed.
match syntax is almost the same as the Rust, e.g.,

match my_enum {
    MyEnum::Unit => {
        return 1
    }
    MyEnum::Tuple(a, b, true) => {
        return u256(a) + b
    }
    MyEnum::Tuple(a, b, false) => {
        return u256(a) - b 
    }
}

// can also use a tuple pattern
match (x, b) => {
   (MyEnum::Unit, true) => {
       return 2
   }
   (MyEnum::Tuple(.., true), true) => {
       return 3
   }
   _ => {
       return 0
   }
}

Currently below patterns are allowed,

  • Wildcard(_), which matches all patterns: _
  • Named variable, which matches all patterns and binds the value to make the value usable in the arm. e.g., a, b and c in MyEnum::Tuple(a, b, c)
  • Boolean literal(true and false)
  • Enum variant. e.g., MyEnum::Tuple(a, b, c)
  • Tuple pattern. e.g., (a, b, c)
  • Rest pattern(..), which matches the rest of the pattern. e.g., MyEnum::Tuple(.., true)
  • Or pattern(|). e.g., MyEnum::Unit | MyEnum::Tuple(.., true)

Also fe compiler checks for exhaustiveness and reachability for pattern matching, and will emit an error when all patterns are not covered or unreachable arms are detected.

Exhaustiveness check example

match val {
    MyEnum::Unit => {
        return 0
    }
    
    MyEnum::Tuple(.., false) => {
        return 1
    }
}

error: patterns are not exhaustive
   ┌─ foo.fe:13:9
   │  
13 │ ╭         match val {
14 │ │             MyEnum::Unit => {
15 │ │                 return 0
16 │ │             }
   · │
20 │ │             }
21 │ │         }
   │ ╰─────────^ `MyEnum::Tuple(_, _, true)` not covered

Reachability check example

match val {
    MyEnum::Unit => {
        return 0
    }
    
    MyEnum::Tuple(a, b, _) => {
        return u256(a) + b
    }

    MyEnum::Tuple(a, b, false) => {
        return u256(a) - b
    }
}

error: unreachable pattern 
   ┌─ foo.fe:22:13
   │
22 │             MyEnum::Tuple(a, b, false) => {
   │             ^^^^^^^^^^^^^^^^^^^^^^^^^^ this arm is unreachable

Implementation details

Exhaustiveness and Reachability

The algorithm to check exhaustiveness and reachability is based on Warnings for pattern matching. The implementation is found in analyzer/src/traversal/pattern_analysis. In this paper, there are only three types of patterns defined.

  1. Wildcard
  2. Or
  3. Constructor

In the implementation, SimplifiedPattern represents them, and high-level patterns in fe are properly lowered into it in the simplify_pattern function in the module.

match lowering

match statement should be lowered into MIR, which has a switch instruction but can no longer represent such a high-level control flow. switch instruction represents branching to multiple destinations according to the target(scrutinee) value, but the target value should be a primitive type(integer or boolean). So this lowering can basically be regarded as the decision tree construction. But multiple decision trees can be constructed from the same match statement, so we need to construct a better one. To achieve this, I implemented the method which is described in Compiling Pattern Matching to Good Decision Trees. The implementation can be found in mir/src/lower/pattern_matching.

E.g.,

enum MyEnum {
    Unit
    Tuple(u32, u256, bool)
}

enum MyEnumNested {
    Unit
    Nested(MyEnum)
    
    fn nested(self) -> u256 {
        match self {
            MyEnumNested::Nested(MyEnum::Unit | MyEnum::Tuple(.., false)) => {
                return 1
            }
            
            
            _ => {
                return 0
            }
        }
    }
}

↓ Decision tree construction

dt

↓ MIR construction from the decision tree

mir

To-Do

  • OPTIONAL: Update Spec if applicable
  • Add entry to the release notes (may forgo for trivial changes)
  • Clean up commit history

@Y-Nak Y-Nak force-pushed the enum-support branch 7 times, most recently from ce2d8ea to 0bfa687 Compare July 20, 2022 15:59
@Y-Nak Y-Nak force-pushed the enum-support branch 8 times, most recently from 06b77a2 to 4c3f471 Compare July 28, 2022 12:02
@Y-Nak Y-Nak force-pushed the enum-support branch 8 times, most recently from 3ed56cc to 05309ef Compare August 10, 2022 17:48
Comment on lines +800 to +807
// harness.test_function(
// &mut executor,
// "enum_storage",
// &[uint_token(1), uint_token(2), bool_token(false)],
// Some(&uint_token(100)),
// );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@g-r-a-n-t
When these lines are uncommented, the test fails with the message thread 'features::enum_match' panicked at 'attempt to subtract with overflow', crates/test-utils/src/lib.rs:158:24.
This seems to be a bug related to GasReporter or evm::StackExecutor, but I didn't look into it yet.

Comment on lines +477 to +481
fn select_switch_merge_block(
&self,
cands: &IndexSet<BasicBlockId>,
dests: impl Iterator<Item = BasicBlockId>,
) -> Option<BasicBlockId> {
Copy link
Member Author

@Y-Nak Y-Nak Sep 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This heuristic is necessary to select a switch's exit or merge block because multiple merge block candidates cannot be distinguished only by looking at a MIR CFG. This is because of the existence of a default arm.
Please consider the following example and the corresponding MIR.

pub enum MyEnum {
    Unit
    UnitTuple()
    Tuple(u32, u256)
}

pub enum Tuple2{
    Value(MyEnum, MyEnum)
}


contract Foo {
    pub fn bar() -> u256 {
        let tup: Tuple2 = Tuple2::Value(MyEnum::Tuple(1, 2), MyEnum::UnitTuple())
        let res: u256 = 0
        match tup {
            Tuple2::Value(MyEnum::Unit, MyEnum::Tuple(x, y)) | Tuple2::Value(MyEnum::Tuple(x, y), MyEnum::UnitTuple()) => {
                res = u256(x) + y
            }
            _ => {
                return 0
            }
        }
        
        return res
    }
}

graphviz

Here, the "original" exit block is bb1(or bb5 which can be seen as the exit block because it's the immediate dominator of bb1). But only by looking at the MIR we can't distinguish it from bb7, because both of them are in the union of dominance frontiers of bb3, bb9, and bb12.

NOTE:

  1. This algorithm is deterministic.
  2. This algorithm doesn't change the semantics of the original code even if it fails to find the original exit block. But if it fails to find the original exit block, the generated yul would be much greater than the optimal code.
  3. If there is no return in all arms, this algorithm can certainly find the original exit block.

@Y-Nak Y-Nak mentioned this pull request Sep 4, 2022
3 tasks
@Y-Nak Y-Nak mentioned this pull request Sep 5, 2022
3 tasks
@Y-Nak Y-Nak force-pushed the enum-support branch 4 times, most recently from 787b2e4 to 7722b04 Compare September 6, 2022 20:13
@Y-Nak Y-Nak force-pushed the enum-support branch 2 times, most recently from 92c9010 to 7d5d627 Compare September 6, 2022 21:38
Copy link
Collaborator

@cburgdorf cburgdorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epic work! I just skimmed over the code. A lot of it is way over my head so I didn't do a deep dive. What I looked at looked great 👍 🥷

crates/analyzer/src/namespace/scopes.rs Show resolved Hide resolved
@@ -427,32 +435,34 @@ fn expr_named_thing(
panic!("const type must be fixed size")
}

// Check visibility of constant.
if !id.is_public(context.db()) && id.module(context.db()) != context.module() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is now covered by more generalized code 👍

@@ -0,0 +1,60 @@
# `match` statement
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see the PR covering a spec update, too! ❤️

Copy link
Collaborator

@sbillig sbillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing, as usual @Y-Nak. Sorry it took me so long to read through!

@sbillig sbillig merged commit 7c0c70d into ethereum:master Sep 12, 2022
@Y-Nak Y-Nak deleted the enum-support branch September 13, 2022 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants