Cranelift: implement bmask instruction #1429

Closed
bjorn3 opened this issue Mar 28, 2020 · 7 comments
Labels
- cranelift:area:aarch64: Issues related to AArch64 backend.
- cranelift:area:x64: Issues related to x64 codegen.
- cranelift: Issues related to the Cranelift code generator.

Comments

Contributor

bjorn3 commented Mar 28, 2020

There is currently neither an encoding, nor a legalization for this instruction.

Contributor Author

bjorn3 commented Mar 28, 2020

I think it should be used by translate_vector_icmp and translate_vector_fcmp after an encoding is added.

Contributor

abrown commented Mar 28, 2020

Not sure I follow. I do plan to eventually add a SIMD bitmask instruction using PMOVMSKB, like the one described in the SIMD spec issue, but I don't see how that would change translate_vector_icmp and translate_vector_fcmp; it would be a separate instruction.

Contributor Author

bjorn3 commented Mar 28, 2020

According to the docs, the instruction returns, for every lane, an all-ones integer for a true bool and an all-zeros integer for a false bool. E.g. bmask.i8x2 on [false, true] would return [0x00, 0xff].
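The per-lane behaviour described above can be modelled in a few lines of plain Rust. This is only an illustrative sketch, not Cranelift code: the function name bmask_i8_lanes is hypothetical, and a slice of bool stands in for a boolean vector.

```rust
/// Illustrative model of `bmask` for 8-bit lanes (hypothetical helper,
/// not part of Cranelift): each true lane becomes 0xff, each false lane 0x00.
fn bmask_i8_lanes(lanes: &[bool]) -> Vec<u8> {
    lanes
        .iter()
        .map(|&lane| if lane { 0xff } else { 0x00 })
        .collect()
}
```

Calling bmask_i8_lanes(&[false, true]) reproduces the bmask.i8x2 example from the comment.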

translate_vector_{icmp,fcmp} currently assume that icmp and fcmp on vectors represent the resulting bools as all zeros / all ones; otherwise the raw_bitcast would fail. This is not guaranteed anywhere, and for non-vector values the assumption is wrong. Using e.g. bmask.i8x16 to convert a b8x16, instead of raw_bitcast.i8x16, would avoid making this assumption.
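To make the difference concrete, here is a rough sketch in plain Rust of why a raw bit reinterpretation depends on how a backend stores bools while a bmask-style conversion does not. Everything in it is an assumption made for illustration: B8Lane, raw_bitcast_lanes and bmask_lanes are hypothetical names, and the model treats any non-zero stored byte as true.

```rust
/// Hypothetical storage for one b8 lane: the raw byte a backend keeps for a bool.
/// Assumption for this sketch: any non-zero byte means "true".
#[derive(Clone, Copy)]
struct B8Lane(u8);

/// Models a raw bitcast to i8 lanes: the stored bits are passed through unchanged,
/// so the result depends on the backend's bool representation.
fn raw_bitcast_lanes(lanes: &[B8Lane]) -> Vec<u8> {
    lanes.iter().map(|lane| lane.0).collect()
}

/// Models a bmask-style conversion: the result depends only on truthiness,
/// so every true lane becomes 0xff regardless of how it was stored.
fn bmask_lanes(lanes: &[B8Lane]) -> Vec<u8> {
    lanes
        .iter()
        .map(|lane| if lane.0 != 0 { 0xff } else { 0x00 })
        .collect()
}
```

In this model, when true is stored as 0xff the two functions agree, which is the case where the conversion could be a no-op. When true is stored as 0x01, only bmask_lanes still yields all-ones lanes.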

Contributor

abrown commented Mar 28, 2020

As I understand it, on x86 and ARM the output of vector icmp and fcmp is another vector with lanes of all zeroes or all ones. Are you suggesting using bmask.i8x16 in translate_vector_{icmp,fcmp} to make things look more type-correct but actually encode it as a noop on those platforms?

Contributor Author

bjorn3 commented Mar 28, 2020

Yes

bnjbvr added the cranelift label Apr 1, 2020

Member

cfallin commented May 4, 2022

Closing this in favor of #3205, which encompasses the general topic of our vector-bool representation.

cfallin closed this as completed May 4, 2022