Cranelift CLIF Fuzzer generate blocks and branches #3094

afonso360 · 2021-07-18T11:36:28Z

Hey,

Continuing the implementation of the CLIF Fuzzer (tracked in #3050), this PR implements block generation and generating basic branch instructions (jump, brnz, brz, br_icmp).

In this implementation we generate all blocks and signatures up front, and then while generating instructions pick random target blocks to jump to.

This PR is based on top of #3062 which should be merged soon, but I think we can get some issues sorted out while waiting for that to be merged.

afonso360 · 2021-07-18T11:36:37Z

This currently quickly crashes in the verify fuzzer with a out of memory error.

This crash is reproducible with:

 echo "JyEEAf8B/ydxAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACcAAAAAQgAAAAAAAAAAAAAAvARAAAIAAAClAAAnAAA=" | base64 -d > fuzz-input
 cargo fuzz run cranelift-fuzzgen-verify ./fuzz-input

Here's what the FunctionBuilder looks like before calling seal_all_blocks (which is where this crashes):

function u0:0(i32, i64, i8, b1, i8) system_v {
block0(v0: i32, v1: i64, v2: i8, v3: b1, v4: i8):
    br_icmp eq v2, v2, block1(v3, v3, v3, v3, v3, v3, v3, v3, v3, v3, v3)
    jump block1(v3, v3, v3, v3, v3, v3, v3, v3, v3, v3, v3)

block1(v5: b1, v6: b1, v7: b1, v8: b1, v9: b1, v10: b1, v11: b1, v12: b1, v13: b1, v14: b1, v15: b1, v16: i8):
    br_icmp eq v16, v16, block1(v15, v15, v15, v15, v15, v15, v15, v15, v15, v15, v15)
    jump block1(v15, v15, v15, v15, v15, v15, v15, v15, v15, v15, v15)

block2(v17: i8):
    br_icmp eq v17, v17, block5
    jump block2

block3(v18: b1, v19: i8):
    br_icmp eq v19, v19, block1(v18, v18, v18, v18, v18, v18, v18, v18, v18, v18, v18)
    jump block1(v18, v18, v18, v18, v18, v18, v18, v18, v18, v18, v18)

block4(v20: b1, v21: i8):
    br_icmp ugt v21, v21, block1(v20, v20, v20, v20, v20, v20, v20, v20, v20, v20, v20)
    jump block3

block5(v22: b1, v23: i8, v24: i8):
    br_icmp ule v23, v24, block1(v22, v22, v22, v22, v22, v22, v22, v22, v22, v22, v22)
    jump block1(v22, v22, v22, v22, v22, v22, v22, v22, v22, v22, v22)
}

This input doesn't look unreasonable to me. None of the blocks have a return, but do we need to guarantee a return is present?

I just noticed now that 2 of the jump's provide no args to their target blocks, but we definitely pass them in in the instruction inserter. Is this because we picked a non local variable?

Here is the output of cargo-fuzz:

     Running `/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify -artifact_prefix=/mnt/c/Users/Afonso/CLionProjects/wasmtime/fuzz/artifacts/cranelift-fuzzgen-verify/ ar
tifacts/cranelift-fuzzgen-verify/oom-a8aef5b7b1e8d020aae51e6d3dedc9418b04dd99`
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1365534722
INFO: Loaded 1 modules   (221787 inline 8-bit counters): 221787 [0x56195415718a, 0x56195418d3e5),
INFO: Loaded 1 PC tables (221787 PCs): 221787 [0x56195418d3e8,0x5619544ef998),
/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify: Running 1 inputs 1 time(s) each.
Running: artifacts/cranelift-fuzzgen-verify/oom-a8aef5b7b1e8d020aae51e6d3dedc9418b04dd99
==2822== ERROR: libFuzzer: out-of-memory (used: 2093Mb; limit: 2048Mb)
   To change the out-of-memory limit use -rss_limit_mb=<N>

Live Heap Allocations: 2037467893 bytes in 54 chunks; quarantined: 32 bytes in 1 chunks; 8736 other chunks; total chunks: 8791; showing top 95% (at most 8 unique contexts)
2013265920 byte(s) (98%) in 1 allocation(s)
    #0 0x5619523a36a9 in realloc /rustc/llvm/src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x56195247a6e4 in alloc::raw_vec::finish_grow::h45ef09ff8efa3011 (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x8aa6e4)
    #2 0x5619522ca86f in alloc::raw_vec::RawVec$LT$T$C$A$GT$::reserve::do_reserve_and_handle::h10ad7ae98da66681 (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x6f
a86f)
    #3 0x5619524aae90 in cranelift_frontend::ssa::SSABuilder::use_var_nonlocal::hfb5d4209dbff3efc (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x8dae90)
    #4 0x5619524b0cba in cranelift_frontend::ssa::SSABuilder::run_state_machine::h4be05d0025ad2762 (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x8e0cba)
    #5 0x5619524ac4dd in cranelift_frontend::ssa::SSABuilder::seal_one_block::h0168f5ef0239babb (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x8dc4dd)
    #6 0x56195249e736 in cranelift_frontend::frontend::FunctionBuilder::seal_all_blocks::h80281c0fe53cdd5a (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x8ce736)
    #7 0x56195244447c in cranelift_fuzzgen::function_generator::FunctionGenerator::generate::h960858ee4bfd5fd6 (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x874
47c)
    #8 0x5619524451be in _$LT$cranelift_fuzzgen..TestCase$u20$as$u20$arbitrary..Arbitrary$GT$::arbitrary::h8ffee5eb12788765 (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen
-verify+0x8751be)
    #9 0x5619523ec5c7 in rust_fuzzer_test_input (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x81c5c7)
    #10 0x561953b286d0 in __rust_try (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x1f586d0)
    #11 0x561953b27e5f in LLVMFuzzerTestOneInput (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x1f57e5f)
    #12 0x561953b2b5fc in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x1f5b5fc)
    #13 0x561953b3ba29 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x1f6ba29)
    #14 0x561953b45222 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x1f7522
2)
    #15 0x561952327786 in main (/mnt/c/Users/Afonso/CLionProjects/wasmtime/target/x86_64-unknown-linux-gnu/release/cranelift-fuzzgen-verify+0x757786)
    #16 0x7f7187be50b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16

SUMMARY: libFuzzer: out-of-memory
────────────────────────────────────────────────────────────────────────────────

Error: Fuzz target exited with exit status: 71

github-actions · 2021-07-18T11:56:26Z

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "fuzzing"

Thus the following users have been cc'd because of the following labels:

fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

cfallin · 2021-07-26T21:37:26Z

but do we need to guarantee a return is present?

Nope, a function that does not terminate (loops infinitely) is legal CLIF, though of course that should hit the timeout mechanism you've added to the interpreter.

Live Heap Allocations: 2037467893 bytes in 54 chunks; quarantined: 32 bytes in 1 chunks; 8736 other chunks; total chunks: 8791; showing top 95% (at most 8 unique contexts)

This looks like uncontrolled Vec growth in the SSA-builder frontend; either the predecessor-block mechanism or the variable tracking. Possibly the way in which blocks are sealed is incorrect, though I haven't looked closely at the details here yet. Are you able to dig a bit more and e.g. add some tracing logging to the algorithm to see what's going wrong?

afonso360 · 2021-08-01T08:24:42Z

Hey,

Had a busy week and couldn't really look at this until today. I think this is what is going on, but I tried to build a minimized test demonstrating this behaviour and couldn't reproduce it.

It looks like there are 2 blocks that are important here block5 and block2.

The infinite loop happens after calling begin_predecessors_lookup for block5, which from my understanding is trying to look for a definition for variable 3 in block2, but we don't use varable 3 in block2.

Block: block2 - signature: []
use_var_nonlocal: var: Variable(2)	ty: types::I8	block: block2
	use_var_nonlocal: case: Unsealed(v17)

Block: block5 - signature: []
use_var_nonlocal: var: Variable(3)	ty: types::B1	block: block5
	use_var_nonlocal: case: Unsealed(v22)
use_var_nonlocal: var: Variable(2)	ty: types::I8	block: block5
	use_var_nonlocal: case: Unsealed(v23)
use_var_nonlocal: var: Variable(4)	ty: types::I8	block: block5
	use_var_nonlocal: case: Unsealed(v24)

...

begin_predecessors_lookup: sentinel: v22 - dest_block: block5
	begin_predecessors_lookup: calls: [FinishPredecessorsLookup(v22, block5), UseVar(block2)]

run_state_machine: processing call: UseVar(block2) for var: Variable(3)
	run_state_machine: call queue: [FinishPredecessorsLookup(v22, block5)]
	var_defs: var: Variable(3) - ssa_block: block2 - blockindex: 2 - var_defs - SecondaryMap { elems: [Some(v3), Some(v15), None, Some(v18), Some(v20), Some(v22)], default: None, unused: PhantomData } - var_defs[ssa_block]: None
use_var_nonlocal: var: Variable(3)	ty: types::B1	block: block2
	use_var_nonlocal: case: SealedOnePredecessor(block2)

run_state_machine: processing call: UseVar(block2) for var: Variable(3)
	run_state_machine: call queue: [FinishPredecessorsLookup(v22, block5), FinishSealedOnePredecessor(block2)]
	var_defs: var: Variable(3) - ssa_block: block2 - blockindex: 2 - var_defs - SecondaryMap { elems: [Some(v3), Some(v15), None, Some(v18), Some(v20), Some(v22)], default: None, unused: PhantomData } - var_defs[ssa_block]: None
use_var_nonlocal: var: Variable(3)	ty: types::B1	block: block2
	use_var_nonlocal: case: SealedOnePredecessor(block2)

This then repeats forever increasing the FinishSealedOnePredecessor(block2) cases in the call queue, which is what causes the uncontrollable vec growth.

I've pushed the log calls into a separate commit.

Edit: Had some further progress here.

We are indeed looking for variable 3 in block2 which should be ok, but that's not what is causing the issue. The issue is happening because block2 only has one predecessor which is block2, so we keep recursing through its predecessors causing the infinite loop.

Looking at the code, block2 is indeed the only predecessor to block2, and I assume that we want to mark a block as its own predecessor if it jumps to itself.

So, should we stop the lookup in use_var_nonlocal if we are trying to lookup ourselves again? (seems reasonable to me)

This also brings up another issue, which is: block2 and block5 are unreachable from block0, we never actually jump to either of them, so we are never going to find the definition of variable 3 in their predecessors.

If I understand the paper correctly, they deal with this in Algorithm 3 by DCE'ing these blocks?

cfallin · 2021-08-06T21:11:45Z

Looking at the code, block2 is indeed the only predecessor to block2, and I assume that we want to mark a block as its own predecessor if it jumps to itself.

So, should we stop the lookup in use_var_nonlocal if we are trying to lookup ourselves again? (seems reasonable to me)

Ah -- yes, we need to handle the self-predecessor case specially here, I think. This likely never came up before because the Wasm translator will not generate code of that form.

This also brings up another issue, which is: block2 and block5 are unreachable from block0, we never actually jump to either of them, so we are never going to find the definition of variable 3 in their predecessors.

If I understand the paper correctly, they deal with this in Algorithm 3 by DCE'ing these blocks?

Yes, it looks like they replace any phi whose only input is itself with undef; this makes sense as such a phi could only occur in an unreachable infinite loop (otherwise the loop entry would have to feed into the phi too).

Unreachable code can create all sorts of interesting edge cases in analyses. As one design point, in the regalloc2 fuzzer I took care to only generate reachable blocks in CFGs by construction, and require that for all input code; but I don't think it makes as much sense to create such a mandate in a general-purpose compiler, as it imposes too much on the IR producer. The upshot is that fuzzing like this will certainly expose our unreachable-code-related bugs, so thank you :-)

This issue was found by the CLIF fuzzer in bytecodealliance#3094.

Perform a search over block predecessors trying to find loops of unreachable predecessors. We do this by iterating on predecessors and marking them as visited, stopping if we find a previously visited block or if we find a block with multiple predecessors. This issue was found by the CLIF fuzzer in bytecodealliance#3094.

afonso360 · 2021-09-03T16:53:05Z

@cfallin I can't request a review on GH, but I think with #3258 merged we can start a review on this.

cfallin

This looks great! Thanks for the work in tracking down the SSA-builder issues to enable this, too. I'm interested to see what sorts of new fuzzbugs this might find...

afonso360 mentioned this pull request Jul 18, 2021

Tracking issue for the Cranelift CLIF-level differential fuzzer #3050

Open

32 tasks

github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. fuzzing Issues related to our fuzzing infrastructure labels Jul 18, 2021

afonso360 force-pushed the fuzzer-branching branch from c833bc2 to 73ef713 Compare July 20, 2021 11:14

This was referenced Jul 21, 2021

cranelift: Fix trampoline args for b1 types #3102

Merged

cranelift: Resolve alias lookups in interpreter #3109

Merged

afonso360 force-pushed the fuzzer-branching branch from 73ef713 to 37096f5 Compare August 1, 2021 08:24

afonso360 added a commit to afonso360/wasmtime that referenced this pull request Aug 27, 2021

cranelift: Prevent infinite loops in ssa frontend with unreachable code.

4786713

This issue was found by the CLIF fuzzer in bytecodealliance#3094.

afonso360 mentioned this pull request Aug 27, 2021

cranelift: Prevent infinite loops in ssa frontend with unreachable code. #3258

Merged

afonso360 added a commit to afonso360/wasmtime that referenced this pull request Aug 27, 2021

cranelift: Prevent infinite loops in ssa frontend with unreachable code.

0b4e696

This issue was found by the CLIF fuzzer in bytecodealliance#3094.

afonso360 added a commit to afonso360/wasmtime that referenced this pull request Aug 27, 2021

cranelift: Prevent infinite loops in ssa frontend with unreachable code.

0b2ce83

This issue was found by the CLIF fuzzer in bytecodealliance#3094.

afonso360 added 3 commits September 1, 2021 19:43

cranelift: CLIF Fuzzer limit instructions executed in interpreter

fb1201c

cranelift: CLIF Fuzzer generate multiple blocks

f4bd7d1

cranelift: CLIF Fuzzer generate brz/brnz/bricmp instructions

f0f2efb

afonso360 force-pushed the fuzzer-branching branch from 37096f5 to f0f2efb Compare September 1, 2021 19:21

cfallin approved these changes Sep 3, 2021

View reviewed changes

cfallin merged commit 36b7e81 into bytecodealliance:main Sep 3, 2021

afonso360 mentioned this pull request Sep 3, 2021

Cranelift CLIF Fuzzer add jump tables and br_table #3299

Merged

Mrmaxmeier mentioned this pull request Sep 8, 2021

Cranelift: Infinite loop in cranelift-frontend's variable ssa state machine #2903

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cranelift CLIF Fuzzer generate blocks and branches #3094

Cranelift CLIF Fuzzer generate blocks and branches #3094

afonso360 commented Jul 18, 2021 •

edited

Loading

afonso360 commented Jul 18, 2021 •

edited

Loading

github-actions bot commented Jul 18, 2021

cfallin commented Jul 26, 2021

afonso360 commented Aug 1, 2021 •

edited

Loading

cfallin commented Aug 6, 2021

afonso360 commented Sep 3, 2021 •

edited

Loading

cfallin left a comment

Cranelift CLIF Fuzzer generate blocks and branches #3094

Cranelift CLIF Fuzzer generate blocks and branches #3094

Conversation

afonso360 commented Jul 18, 2021 • edited Loading

afonso360 commented Jul 18, 2021 • edited Loading

github-actions bot commented Jul 18, 2021

Subscribe to Label Action

cfallin commented Jul 26, 2021

afonso360 commented Aug 1, 2021 • edited Loading

cfallin commented Aug 6, 2021

afonso360 commented Sep 3, 2021 • edited Loading

cfallin left a comment

Choose a reason for hiding this comment

afonso360 commented Jul 18, 2021 •

edited

Loading

afonso360 commented Jul 18, 2021 •

edited

Loading

afonso360 commented Aug 1, 2021 •

edited

Loading

afonso360 commented Sep 3, 2021 •

edited

Loading