Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cranelift: Remove booleans #5031

Merged
merged 12 commits into from
Oct 17, 2022

Conversation

elliottt
Copy link
Member

@elliottt elliottt commented Oct 7, 2022

Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements.

Fixes #3205

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels Oct 7, 2022
@@ -164,7 +164,6 @@ impl Context for IsleContext<'_, '_, MInst, Flags, IsaFlags, 6> {
fn integral_ty(&mut self, ty: Type) -> Option<Type> {
match ty {
I8 | I16 | I32 | I64 | R64 => Some(ty),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should list R32 too, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, though I'm not sure we should address that on this branch. Would you like to add it in a separate PR?

v1 = bextend.b32 v0
function %bextend_i16_i32(i16) -> i32 {
block0(v0: i16):
v1 = bextend.i32 v0
return v1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can bextend be replaced with uextend (for non-vector bools) and sextend (for vector bools)?

Copy link
Contributor

@afonso360 afonso360 Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favor of removing bextend and breduce. The latter can also be replaced by ireduce.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, if we no longer have bools then these instructions no longer make sense; the user can zero- or sign-extend an integer (which happens to be a 0/1 or 0/-1 "truthy" value) in the usual ways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like using uextend/sextend in place of bextend, and have made that refactoring. I've also removed bconst, bextend, breduce and bint, which has collapsed a lot of cases in the test suite.

@afonso360
Copy link
Contributor

afonso360 commented Oct 7, 2022

I'm really happy that we're finally doing this! Thank you for taking this on.

Some of our lowerings depend on the fact that bools can only be represented as all ones or all zeroes, and so now that we are removing that assumption we might want to revisit them.

The AArch64 implementation for bmask is wrong if we want to accept any integer other than -1 and 0. I would expect to get -1 if I do a bmask on a 2, but that's not what's going to happen. (I haven't tested this, but that's my reading of the implementation).

bint is also assuming that inputs are all ones or all zeros and is implemented as and_imm on AArch64. (i.e. 2 will be false, but should be true)

Edit: AArch64 has instructions (cset and csetm) that represent these two operations, so we may want to partially revert b8f6d53 to get them back

@elliottt elliottt force-pushed the trevor/remove-booleans branch 6 times, most recently from f1c278a to 0ff1557 Compare October 7, 2022 23:21
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow -- thank you for doing the not-very-flashy but very important cleanup work here! It's surprising to me, but validating of this direction, that there are so many little corner cases that this removes.

Some scattered thoughts below; I read fairly carefully early in the diff, but my eyes started to glaze over partway through the test-output deltas, so I'll want to look at those some more in another pass. Hopefully this is good to get started though.

@@ -685,7 +681,7 @@ pub(crate) fn define(
let iflags: &TypeVar = &ValueType::Special(types::Flag::IFlags.into()).into();
let fflags: &TypeVar = &ValueType::Special(types::Flag::FFlags.into()).into();

let b1: &TypeVar = &ValueType::from(LaneType::from(types::Bool::B1)).into();
let b1: &TypeVar = &ValueType::from(LaneType::from(types::Int::I8)).into();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need a typevar named b1 at all? Is this used for "truthy" things like icmp return values etc? If so, perhaps we could call it truthy or somesuch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed it to i8, as that should hopefully give good documentation for the places that it's used.

@@ -157,7 +152,7 @@ impl Type {
/// Scalar types are all converted to `b1` which is usually what you want.
pub fn as_bool(self) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could remove the .as_bool() helper altogether?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about renaming it to .as_truthy() instead? It does convert types like F32 to I32, so it would be convenient to keep something like it around.

cranelift/codegen/src/prelude.isle Outdated Show resolved Hide resolved
cranelift/codegen/src/verifier/mod.rs Show resolved Hide resolved
cranelift/filetests/filetests/isa/aarch64/condbr.clif Outdated Show resolved Hide resolved
@cfallin
Copy link
Member

cfallin commented Oct 7, 2022

Also, just to note it here: if at all possible, I'd prefer for us to wait to merge this until after my mid-end work in #4953 is merged, if only because they're both large and likely-to-conflict PRs and mine has been through ~three painful rebases already :-)

@elliottt elliottt force-pushed the trevor/remove-booleans branch 13 times, most recently from 78b3062 to 863b94e Compare October 12, 2022 01:40
@elliottt
Copy link
Member Author

Following up on myself, it would be even simpler and better to use the existing value_nonzero helper, which will fold in previous comparisons if possible:

 (rule (lower (has_type ty (bmask x)))
       (lower_bool_to_mask ty (value_nonzero x)))

This is much nicer! I'll rework the changes to add lower_bool_to_mask instead.

@uweigand
Copy link
Member

This is much nicer! I'll rework the changes to add lower_bool_to_mask instead.

Thanks! The s390x parts LGTM now.

@elliottt elliottt force-pushed the trevor/remove-booleans branch 2 times, most recently from 9162e71 to 498215a Compare October 17, 2022 21:25
elliottt and others added 12 commits October 17, 2022 14:39
Remove the boolean types from cranelift, and the associated instructions
"breduce", "bextend", "bconst", and "bint". Standardize on using `1`/`0`
for the return value from instructions that produce scalar boolean
results, and `-1`/`0` for boolean vector elements.

Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com>
Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
@elliottt elliottt merged commit 32a7593 into bytecodealliance:main Oct 17, 2022
cfallin added a commit to cfallin/wasmtime that referenced this pull request Dec 22, 2022
In bytecodealliance#5031, we removed `bool` types from CLIF, using integers instead for
"truthy" values. This greatly simplified the IR, and was generally an
improvement.

However, because x86's `SETcc` instruction sets only the low 8 bits of a
register, we chose to use `i8` types as the result of `icmp` and `fcmp`,
to avoid the need for a masking operation when materializing the result.

Unfortunately this means that uses of truthy values often now have
`uextend` operations, especially when coming from Wasm (where truthy
values are naturally `i32`-typed). For example, where we previously had
`(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`.

It's arguable whether or not we should switch to `i32` truthy values --
in most cases we can avoid materializing a value that's immediately used
for a branch or select, so a mask would in most cases be unnecessary,
and it would be a win at the IR level -- but irrespective of that, this
change *did* regress our generated code quality: our backends had
patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we
were *always* materializing truthy values. Many blocks thus ended with
"cmp; setcc; cmp; test; branch" rather than "cmp; branch".

In bytecodealliance#5391 we noticed this and fixed it on x64, but it was a general
problem on aarch64 and riscv64 as well. This PR introduces a
`maybe_uextend` extractor that "looks through" uextends, and uses it
where we consume truthy values, thus fixing the regression.  This PR
also adds compile filetests to ensure we don't regress again.
cfallin added a commit to cfallin/wasmtime that referenced this pull request Dec 22, 2022
In bytecodealliance#5031, we removed `bool` types from CLIF, using integers instead for
"truthy" values. This greatly simplified the IR, and was generally an
improvement.

However, because x86's `SETcc` instruction sets only the low 8 bits of a
register, we chose to use `i8` types as the result of `icmp` and `fcmp`,
to avoid the need for a masking operation when materializing the result.

Unfortunately this means that uses of truthy values often now have
`uextend` operations, especially when coming from Wasm (where truthy
values are naturally `i32`-typed). For example, where we previously had
`(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`.

It's arguable whether or not we should switch to `i32` truthy values --
in most cases we can avoid materializing a value that's immediately used
for a branch or select, so a mask would in most cases be unnecessary,
and it would be a win at the IR level -- but irrespective of that, this
change *did* regress our generated code quality: our backends had
patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we
were *always* materializing truthy values. Many blocks thus ended with
"cmp; setcc; cmp; test; branch" rather than "cmp; branch".

In bytecodealliance#5391 we noticed this and fixed it on x64, but it was a general
problem on aarch64 and riscv64 as well. This PR introduces a
`maybe_uextend` extractor that "looks through" uextends, and uses it
where we consume truthy values, thus fixing the regression.  This PR
also adds compile filetests to ensure we don't regress again.
cfallin added a commit to cfallin/wasmtime that referenced this pull request Dec 22, 2022
In bytecodealliance#5031, we removed `bool` types from CLIF, using integers instead for
"truthy" values. This greatly simplified the IR, and was generally an
improvement.

However, because x86's `SETcc` instruction sets only the low 8 bits of a
register, we chose to use `i8` types as the result of `icmp` and `fcmp`,
to avoid the need for a masking operation when materializing the result.

Unfortunately this means that uses of truthy values often now have
`uextend` operations, especially when coming from Wasm (where truthy
values are naturally `i32`-typed). For example, where we previously had
`(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`.

It's arguable whether or not we should switch to `i32` truthy values --
in most cases we can avoid materializing a value that's immediately used
for a branch or select, so a mask would in most cases be unnecessary,
and it would be a win at the IR level -- but irrespective of that, this
change *did* regress our generated code quality: our backends had
patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we
were *always* materializing truthy values. Many blocks thus ended with
"cmp; setcc; cmp; test; branch" rather than "cmp; branch".

In bytecodealliance#5391 we noticed this and fixed it on x64, but it was a general
problem on aarch64 and riscv64 as well. This PR introduces a
`maybe_uextend` extractor that "looks through" uextends, and uses it
where we consume truthy values, thus fixing the regression.  This PR
also adds compile filetests to ensure we don't regress again.

The riscv64 backend has not been updated here because doing so appears
to trigger another issue in its branch handling; fixing that is TBD.
cfallin added a commit that referenced this pull request Dec 22, 2022
#5487)

In #5031, we removed `bool` types from CLIF, using integers instead for
"truthy" values. This greatly simplified the IR, and was generally an
improvement.

However, because x86's `SETcc` instruction sets only the low 8 bits of a
register, we chose to use `i8` types as the result of `icmp` and `fcmp`,
to avoid the need for a masking operation when materializing the result.

Unfortunately this means that uses of truthy values often now have
`uextend` operations, especially when coming from Wasm (where truthy
values are naturally `i32`-typed). For example, where we previously had
`(brz (icmp ...))`, we now have `(brz (uextend (icmp ...)))`.

It's arguable whether or not we should switch to `i32` truthy values --
in most cases we can avoid materializing a value that's immediately used
for a branch or select, so a mask would in most cases be unnecessary,
and it would be a win at the IR level -- but irrespective of that, this
change *did* regress our generated code quality: our backends had
patterns for e.g. `(brz (icmp ...))` but not with the `uextend`, so we
were *always* materializing truthy values. Many blocks thus ended with
"cmp; setcc; cmp; test; branch" rather than "cmp; branch".

In #5391 we noticed this and fixed it on x64, but it was a general
problem on aarch64 and riscv64 as well. This PR introduces a
`maybe_uextend` extractor that "looks through" uextends, and uses it
where we consume truthy values, thus fixing the regression.  This PR
also adds compile filetests to ensure we don't regress again.

The riscv64 backend has not been updated here because doing so appears
to trigger another issue in its branch handling; fixing that is TBD.
sampsyo added a commit to sampsyo/bril that referenced this pull request May 21, 2023
The B1 type was removed in:
bytecodealliance/wasmtime#5031

This is nice; it removes some weird special-cases having to do with
converting booleans to bytes when loading/storing. Now booleans are just
I8s.
aDotInTheVoid added a commit to aDotInTheVoid/wasmtime that referenced this pull request Apr 18, 2024
github-merge-queue bot pushed a commit that referenced this pull request Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cranelift: formalize, document, and audit binary representation of booleans
5 participants