Skip to content

Commit

Permalink
PCC: draw the rest of the owl: fully-working PCC on hello-world Wasm …
Browse files Browse the repository at this point in the history
…on aarch64 (#7281)

* PCC: draw the rest of the owl: fully-working PCC on hello-world Wasm on aarch64

This needed a bit more inference / magic than I was hoping for at first,
specifically around constants and adds. Some instructions can now
generate facts on their output registers, even if not stated. This
breaks away from the "breadcrumbs" idea, but seems to be the most
practical solution to a general problem that we have mini-lowering steps
in various places without careful preservation of PCC facts. Two
particular aspects:

- Constants: amodes on aarch64 can decompose into new
  constant-generation instructions, and we need precise ranges on those
  to properly check them. To avoid making the ISLE rules nightmarish,
  it's best to reuse the existing semantics definitions of the Add* ALU
  insts, and add a few rules for MovK/MovZ/MovN.

- Adds: similarly, amodes decompose into de-novo add instructions with
  no facts. To handle this, there's now a notion of "propagating" facts
  that cause an instruction with a propagating fact on the input to
  generate a fact on the output.

Together, these heuristics mean that we'll eagerly generate a fact for
`mem(mt0, 0, 0) + 8 -> mem(mt0, 8, 8)`, but we won't, for example,
generate ranges on every single integer operation.

With these changes and a few other misc fixes, this PR can now check a
nontrivial "hello world" Wasm on aarch64 down to the machine-code level:

```
$ target/release/wasmtime compile -C enable-pcc=y ../wasm-tests/helloworld-rs.wasm
```

* Review feedback.
  • Loading branch information
cfallin authored Oct 19, 2023
1 parent d86afc0 commit 8d19280
Show file tree
Hide file tree
Showing 63 changed files with 911 additions and 588 deletions.
1 change: 1 addition & 0 deletions cranelift/codegen/src/egraph.rs
Original file line number Diff line number Diff line change
Expand Up @@ -353,6 +353,7 @@ where
new_result
);
self.value_to_opt_value[result] = new_result;
self.func.dfg.merge_facts(result, new_result);
true
}
// Otherwise, generic side-effecting op -- always keep it, and
Expand Down
24 changes: 20 additions & 4 deletions cranelift/codegen/src/ir/dfg.rs
Original file line number Diff line number Diff line change
Expand Up @@ -959,7 +959,13 @@ impl DataFlowGraph {
// Get the controlling type variable.
let ctrl_typevar = self.ctrl_typevar(inst);
// Create new result values.
self.make_inst_results(new_inst, ctrl_typevar);
let num_results = self.make_inst_results(new_inst, ctrl_typevar);
// Copy over PCC facts, if any.
for i in 0..num_results {
let old_result = self.inst_results(inst)[i];
let new_result = self.inst_results(new_inst)[i];
self.facts[new_result] = self.facts[old_result].clone();
}
new_inst
}

Expand Down Expand Up @@ -1301,9 +1307,19 @@ impl DataFlowGraph {
(None, Some(b)) => {
self.facts[a] = Some(b.clone());
}
_ => {
self.facts[a] = Some(Fact::Conflict);
self.facts[b] = Some(Fact::Conflict);
(Some(a_fact), Some(b_fact)) => {
assert_eq!(self.value_type(a), self.value_type(b));
let merged = Fact::intersect(a_fact, b_fact);
crate::trace!(
"facts merge on {} and {}: {:?}, {:?} -> {:?}",
a,
b,
a_fact,
b_fact,
merged,
);
self.facts[a] = Some(merged.clone());
self.facts[b] = Some(merged);
}
}
}
Expand Down
154 changes: 127 additions & 27 deletions cranelift/codegen/src/ir/pcc.rs
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,7 @@
use crate::ir;
use crate::ir::types::*;
use crate::isa::TargetIsa;
use crate::machinst::{BlockIndex, LowerBackend, MachInst, VCode};
use crate::machinst::{BlockIndex, LowerBackend, VCode};
use crate::trace;
use regalloc2::Function as _;
use std::fmt;
Expand Down Expand Up @@ -190,29 +190,43 @@ impl fmt::Display for Fact {
}

impl Fact {
/// Try to infer a minimal fact for a value of the given IR type.
pub fn infer_from_type(ty: ir::Type) -> Option<&'static Self> {
static FACTS: [Fact; 4] = [
Fact::Range {
bit_width: 8,
min: 0,
max: u8::MAX as u64,
},
Fact::Range {
bit_width: 16,
min: 0,
max: u16::MAX as u64,
},
Fact::Range {
bit_width: 32,
/// Create a range fact that specifies a single known constant value.
pub fn constant(bit_width: u16, value: u64) -> Self {
debug_assert!(value <= max_value_for_width(bit_width));
// `min` and `max` are inclusive, so this specifies a range of
// exactly one value.
Fact::Range {
bit_width,
min: value,
max: value,
}
}

/// Create a range fact that specifies the maximum range for a
/// value of the given bit-width.
pub const fn max_range_for_width(bit_width: u16) -> Self {
match bit_width {
bit_width if bit_width < 64 => Fact::Range {
bit_width,
min: 0,
max: u32::MAX as u64,
max: (1u64 << bit_width) - 1,
},
Fact::Range {
64 => Fact::Range {
bit_width: 64,
min: 0,
max: u64::MAX,
},
_ => panic!("bit width too large!"),
}
}

/// Try to infer a minimal fact for a value of the given IR type.
pub fn infer_from_type(ty: ir::Type) -> Option<&'static Self> {
static FACTS: [Fact; 4] = [
Fact::max_range_for_width(8),
Fact::max_range_for_width(16),
Fact::max_range_for_width(32),
Fact::max_range_for_width(64),
];
match ty {
I8 => Some(&FACTS[0]),
Expand All @@ -222,6 +236,80 @@ impl Fact {
_ => None,
}
}

/// Does this fact "propagate" automatically, i.e., cause
/// instructions that process it to infer their own output facts?
/// Not all facts propagate automatically; otherwise, verification
/// would be much slower.
pub fn propagates(&self) -> bool {
match self {
Fact::Mem { .. } => true,
_ => false,
}
}

/// Is this a constant value of the given bitwidth? Return it as a
/// `Some(value)` if so.
pub fn as_const(&self, bits: u16) -> Option<u64> {
match self {
Fact::Range {
bit_width,
min,
max,
} if *bit_width == bits && min == max => Some(*min),
_ => None,
}
}

/// Merge two facts. We take the *intersection*: that is, we know
/// both facts to be true, so we can intersect ranges. (This
/// differs from the usual static analysis approach, where we are
/// merging multiple possibilities into a generalized / widened
/// fact. We want to narrow here.)
pub fn intersect(a: &Fact, b: &Fact) -> Fact {
match (a, b) {
(
Fact::Range {
bit_width: bw_lhs,
min: min_lhs,
max: max_lhs,
},
Fact::Range {
bit_width: bw_rhs,
min: min_rhs,
max: max_rhs,
},
) if bw_lhs == bw_rhs && max_lhs >= min_rhs && max_rhs >= min_lhs => Fact::Range {
bit_width: *bw_lhs,
min: std::cmp::max(*min_lhs, *min_rhs),
max: std::cmp::min(*max_lhs, *max_rhs),
},

(
Fact::Mem {
ty: ty_lhs,
min_offset: min_offset_lhs,
max_offset: max_offset_lhs,
},
Fact::Mem {
ty: ty_rhs,
min_offset: min_offset_rhs,
max_offset: max_offset_rhs,
},
) if ty_lhs == ty_rhs
&& max_offset_lhs >= min_offset_rhs
&& max_offset_rhs >= min_offset_lhs =>
{
Fact::Mem {
ty: *ty_lhs,
min_offset: std::cmp::max(*min_offset_lhs, *min_offset_rhs),
max_offset: std::cmp::min(*max_offset_lhs, *max_offset_rhs),
}
}

_ => Fact::Conflict,
}
}
}

macro_rules! ensure {
Expand Down Expand Up @@ -283,6 +371,23 @@ impl<'a> FactContext<'a> {
bw_lhs == bw_rhs && max_lhs <= max_rhs && min_lhs >= min_rhs
}

(
Fact::Mem {
ty: ty_lhs,
min_offset: min_offset_lhs,
max_offset: max_offset_lhs,
},
Fact::Mem {
ty: ty_rhs,
min_offset: min_offset_rhs,
max_offset: max_offset_rhs,
},
) => {
ty_lhs == ty_rhs
&& max_offset_lhs <= max_offset_rhs
&& min_offset_lhs >= min_offset_rhs
}

_ => false,
}
}
Expand Down Expand Up @@ -497,6 +602,7 @@ impl<'a> FactContext<'a> {
/// that this address accesses, if known, or `None` if the range
/// doesn't constrain the access to exactly one location.
fn check_address(&self, fact: &Fact, size: u32) -> PccResult<Option<(ir::MemoryType, u64)>> {
trace!("check_address: fact {:?} size {}", fact, size);
match fact {
Fact::Mem {
ty,
Expand Down Expand Up @@ -592,7 +698,7 @@ fn max_value_for_width(bits: u16) -> u64 {
/// VCode.
pub fn check_vcode_facts<B: LowerBackend + TargetIsa>(
f: &ir::Function,
vcode: &VCode<B::MInst>,
vcode: &mut VCode<B::MInst>,
backend: &B,
) -> PccResult<()> {
let ctx = FactContext::new(f, backend.triple().pointer_width().unwrap().bits().into());
Expand All @@ -602,14 +708,8 @@ pub fn check_vcode_facts<B: LowerBackend + TargetIsa>(
for block in 0..vcode.num_blocks() {
let block = BlockIndex::new(block);
for inst in vcode.block_insns(block).iter() {
if vcode.inst_defines_facts(inst) || vcode[inst].is_mem_access() {
// This instruction defines a register with a new fact, or
// has some side-effect we want to be careful to
// verify. We'll call into the backend to validate this
// fact with respect to the instruction and the input
// facts.
backend.check_fact(&ctx, vcode, &vcode[inst])?;
}
// Check any output facts on this inst.
backend.check_fact(&ctx, vcode, inst)?;

// If this is a branch, check that all block arguments subsume
// the assumed facts on the blockparams of successors.
Expand Down
4 changes: 2 additions & 2 deletions cranelift/codegen/src/isa/aarch64/lower.rs
Original file line number Diff line number Diff line change
Expand Up @@ -134,8 +134,8 @@ impl LowerBackend for AArch64Backend {
fn check_fact(
&self,
ctx: &FactContext<'_>,
vcode: &VCode<Self::MInst>,
inst: &Self::MInst,
vcode: &mut VCode<Self::MInst>,
inst: InsnIndex,
) -> PccResult<()> {
pcc::check(ctx, vcode, inst)
}
Expand Down
Loading

0 comments on commit 8d19280

Please sign in to comment.