ABI change in `extendhfsf2` and `truncsfhf2` on X86 #56854

kparzysz-quic · 2022-08-01T14:19:44Z

See also: https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366/15

Support for _Float16 was added in 15.0.0 (based on target capabilities). This had the unfortunate consequence that it changed the way calls to __truncsfhf2 and __extendhfsf2 are handled: in prior versions of LLVM, or in the absence of support for _Float16, the half-precision floating-point value was passed as uint16_t. In the x86 calling convention that type is passed in RDI, whereas _Float16 values are passed in XMM0. This introduces a binary incompatibility between LLVM 15 and prior versions.

llvmbot · 2022-08-01T14:23:10Z

@llvm/issue-subscribers-backend-x86

tru · 2022-08-01T14:42:54Z

This seems like it would be a blocker. Who worked on Float16 and can chime in?

phoebewang · 2022-08-01T14:46:26Z

@kparzysz-quic Which platform are you using? AFAIK, only Darwin uses __truncsfhf2 and __extendhfsf2 before 15.0. Other platforms use __gnu_h2f_ieee and __gnu_f2h_ieee. https://godbolt.org/z/ccbbET9d1

phoebewang · 2022-08-01T14:50:38Z

> This seems like it would be a blocker. Who worked on Float16 and can chime in?

I worked on that. I don't think this is a blocker. The change of the ABI is expected. GCC 12 uses the same ABI as well. https://godbolt.org/z/bc4xx5zoa

kparzysz-quic · 2022-08-01T15:35:36Z

To recap: I work on a project where LLVM libraries are used as code generator in another application.

The application has the LLVM libraries linked into it, but it uses external toolchains to link them.
If the application had LLVM 15+ with _Float16 support in it, the code generator would generate calls to __truncsfhf2 and __extendhfsf2. These calls would assume floating-point ABI.
If the external toolchain is clang14 or earlier, the included compiler-rt will have __truncsfhf2 and __extendhfsf2 that assume integer ABI. The code will link, but will give incorrect output.

phoebewang · 2022-08-01T15:59:41Z

I see your point. I'm just wondering whether this is an inherent problem for compiler-rt whenever codegen makes an ABI-breaking change.
OTOH, the external toolchain might also be GCC 12. In that case, we still have to generate calls to these functions with the floating-point ABI.

kparzysz-quic · 2022-08-01T16:11:49Z

I guess a flag (cl::opt, not driver flag) that would force the x86 backend to always use the __gnu_(h2f|f2h)_ieee functions would help. Would it be possible/reasonable to add it?

phoebewang · 2022-08-02T15:42:10Z

It's possible to do it in the compiler, but there's still a problem in compiler-rt, because __gnu_(h2f|f2h)_ieee is just an alias of __truncsfhf2/__extendhfsf2: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/builtins/truncsfhf2.c#L19
It means we would also need to detach them and make __gnu_(h2f|f2h)_ieee always use uint16_t. But we can't simply do that, because ARM/AArch64 have already been using both for almost 2 years. See D91732, D91733. So we would also need to make sure that every current user scenario, across targets (like ARM/AArch64) x platforms (Linux/Darwin/Windows), isn't affected by such a change.

Thinking about it again, I doubt that this is really a reasonable user scenario. We should always use compiler-rt from the same version as the compiler. E.g., D126953 introduced a lowering to the libcall __truncsfbf2 in the compiler together with its implementation in compiler-rt. You cannot use an older compiler-rt for it.
Another example is D107245.
So you cannot assume that an old compiler-rt always works with a newer compiler: we may get either a compile failure or an unexpected result. That said, the problem is somewhat inherent, so we don't necessarily need to solve it.

kparzysz-quic · 2022-08-02T16:07:40Z

I think the best thing ultimately (for TVM) would be to just emit the conversion code directly into the module...
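
As an illustration of what inlined conversion code could look like, here is a minimal sketch in C. It is not taken from TVM or compiler-rt; the names `half_bits_to_float` and `float_to_half_bits` are hypothetical. Both helpers carry the half value as a `uint16_t` bit pattern and use only integer operations, so they do not depend on which ABI the linked runtime's `__extendhfsf2`/`__truncsfhf2` expects. Narrowing rounds to nearest, ties to even, and every NaN is collapsed to one quiet NaN.

```c
#include <stdint.h>
#include <string.h>

/* Widen an IEEE 754 binary16 bit pattern to a float. */
float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (uint32_t)(h >> 10) & 0x1Fu;
    uint32_t man  = (uint32_t)h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {             /* normal: rebias exponent 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {             /* signed zero */
        bits = sign;
    } else {                           /* subnormal: normalize the mantissa */
        int e = -1;
        do { man <<= 1; ++e; } while ((man & 0x400u) == 0);
        bits = sign | ((uint32_t)(112 - e) << 23) | ((man & 0x3FFu) << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Narrow a float to a binary16 bit pattern (round to nearest, ties to even). */
uint16_t float_to_half_bits(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag  = x & 0x7FFFFFFFu;

    if (mag > 0x7F800000u)             /* NaN: return a quiet NaN */
        return (uint16_t)(sign | 0x7E00u);
    if (mag >= 0x47800000u)            /* >= 65536 or infinity: overflow */
        return (uint16_t)(sign | 0x7C00u);
    if (mag >= 0x38800000u) {          /* result is a normal half */
        uint32_t r    = mag - 0x38000000u;   /* rebias exponent 127 -> 15 */
        uint32_t half = r >> 13;
        uint32_t rem  = r & 0x1FFFu;
        if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
            ++half;                    /* a carry may roll over into infinity */
        return (uint16_t)(sign | half);
    }
    if (mag <= 0x33000000u)            /* <= 2^-25: rounds to zero */
        return sign;

    /* Result is a subnormal half: shift the 24-bit significand into place. */
    uint32_t e       = mag >> 23;                     /* 102..112 */
    uint32_t m       = (mag & 0x7FFFFFu) | 0x800000u;
    uint32_t shift   = 126u - e;                      /* 14..24 */
    uint32_t half    = m >> shift;
    uint32_t rem     = m & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1u);
    if (rem > halfway || (rem == halfway && (half & 1u)))
        ++half;
    return (uint16_t)(sign | half);
}
```

Because the half value never appears as a `_Float16` argument or return value, code like this behaves the same regardless of which compiler-rt or libgcc the external toolchain links in. The cost is that it will generally be slower than a native F16C instruction.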

phoebewang · 2022-08-03T07:13:17Z

The best way is to keep compiler-rt at the same version as the compiler.
If that is not possible, a workaround is to avoid lowering to libcalls. The native F16C instructions are available in Ivy Bridge and later processors. For these targets, we just need to use the option -march=ivybridge (or newer) or -mf16c in the Clang front-end and/or -mcpu=ivybridge or "target-features"="+f16c" in the backend.

tru · 2022-08-03T10:13:00Z

Just so I am following along correctly - is it something we want to try to fix before 15 - or can I remove this as a release blocker?

phoebewang · 2022-08-03T13:53:21Z

Yes. I think so. Maybe we can close it @kparzysz-quic ?

kparzysz-quic · 2022-08-03T14:12:49Z

It doesn't seem like this is something that can be fixed, but maybe Saleem (@compnerd) could chime in before we close this.

compnerd · 2022-08-03T14:25:42Z

The compiler-rt builtins are supposed to match the GCC ABI. We are matching the GCC ABI so this seems like the correct behaviour. Overall, the builtins are supposed to have a pretty stable ABI, so it is rather unfortunate that this broke. Replacing the implementation for __gnu_{h2f,f2h}_ieee with an unaliased version is reasonable as long as we continue to conform to the ABI.

I am left wondering how bad of a problem would this be in practice. I haven't given it too much thought, but perhaps we can get away with something like providing both variants and an alias, associated with a comdat symbol and have the compiler select between the two if there is FP16 usage without FP16 hardware support. This would be a pretty hefty penalty for the compiler (and would potentially leak some impact into link times as well). I am rather reluctant to go down that path.

kparzysz-quic · 2022-08-03T14:50:44Z

The worst outcome is that user code compiles and links, but gives a wrong answer. Something like what we do for ABI breaking checks (i.e. introducing version symbols and forcing a linking error) would be nice, but given that we want to be able to link with GCC's runtime, this doesn't seem like a solution either.

I don't want to impose any restrictions on compiler-rt that would hinder LLVM or clang, so maybe we're stuck with the current situation. I just wanted to make sure that we don't leave any option unexplored.

compnerd · 2022-08-03T14:53:51Z

I think that build time checks would be able to alleviate that concern (à la autotools). It should be possible to craft a simple test that helps identify which variant of the ABI is in use and then either abort or change the code generation.
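
A build-time probe along these lines might look like the sketch below. It is an illustration only, not the check that LLVM or any build system actually uses, and all names in it are hypothetical. The idea is to call the widening routine through the integer-ABI signature with a few known binary16 bit patterns and compare the results against the expected float bit patterns. `stand_in_extend` and `ignores_argument` are stand-ins used here to exercise the probe, since the real runtime symbol may not be linkable everywhere.

```c
#include <stdint.h>
#include <string.h>

/* Integer-ABI signature: the half value travels as a 16-bit pattern. */
typedef float (*extend_int_abi_fn)(uint16_t);

/* Returns 1 if `ext` widens every known pattern correctly when called
   through the integer-ABI signature, 0 otherwise. */
int extend_matches_int_abi(extend_int_abi_fn ext) {
    static const struct { uint16_t half; uint32_t float_bits; } cases[] = {
        { 0x3C00u, 0x3F800000u },  /*  1.0     */
        { 0xC000u, 0xC0000000u },  /* -2.0     */
        { 0x3555u, 0x3EAAA000u },  /* ~0.33325 */
        { 0x7BFFu, 0x477FE000u },  /* 65504.0  */
    };
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; ++i) {
        float got = ext(cases[i].half);
        uint32_t got_bits;
        memcpy(&got_bits, &got, sizeof got_bits);
        if (got_bits != cases[i].float_bits)
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for a runtime that uses the integer ABI.
   Only valid for normal (non-zero, finite, non-subnormal) inputs. */
float stand_in_extend(uint16_t h) {
    uint32_t bits = ((uint32_t)(h & 0x8000u) << 16)
                  | ((((uint32_t)(h >> 10) & 0x1Fu) + 112u) << 23)
                  | ((uint32_t)(h & 0x3FFu) << 13);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical stand-in for a mismatched runtime: the callee never sees
   the integer argument, modelled here by ignoring it. */
float ignores_argument(uint16_t h) {
    (void)h;
    return 0.0f;
}
```

In a real configure step one would presumably pass the runtime's own `__extendhfsf2`, declared with the integer signature, instead of the stand-ins; that declaration is an assumption of this sketch. The probe is heuristic: a mismatched callee reads whatever happens to be in the other register, so using several distinct inputs only makes an accidental pass unlikely, not impossible.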

kparzysz-quic · 2022-08-03T14:59:01Z

That may be the way out of this.

Anyway, we should document this change in the release notes, if it's not there yet.

Thanks everyone for your inputs.

phoebewang · 2022-08-04T08:05:02Z

> we should document this change in the release notes, if it's not there yet.

There's a very basic note there: https://github.com/llvm/llvm-project/blob/release/15.x/llvm/docs/ReleaseNotes.rst#changes-to-the-x86-backend
It's good if we can highlight the compiler-rt issue, but I don't know how to change it once branched. @tstellar

tru · 2022-08-04T09:00:31Z

@phoebewang you can do the change directly on the release/15.x branch and we will take care of the rest.

phoebewang · 2022-08-04T09:18:59Z

Yeah, I can put it in my fork repo and then cherry-pick. Thanks @tru.

phoebewang · 2022-08-04T14:33:54Z

I put up a patch: D131172.

See llvm/llvm-project#56854 for more details.

#56854 shows a backwards compatibility problem when builtins of compiler-rt don't follow the ABI. We need to avoid falling into the same trap again for BF16. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D131147

compiler-rt 15 has a breaking ABI change affecting how half precision floating point values are passed between functions (llvm/llvm-project#56854). For reasons I don't fully understand, different versions of clang exhibit the issue on different OSes. For example on Ubuntu 22.04 clang-15 is affected and clang-16 is OK, but on Fedora clang-16 is affected. clang-16 is also the default on Fedora meaning that all the half precision tests fail with random numbers because they're reading garbage. There's a one-liner to test for this issue in the LLVM project, so make use of that to tell if the CLANG backend will produce correct results.

…e can Taking a hint from llvm/llvm-project#56854 (comment) we can make use of `-march=native` flag to replace the library call with a native instruction on Ivy Bridge and above. This doesn't fix the bug in compiler-rt, but it means that we can support CLANG on more hardware. Note that the usual issues of `-march=native` shouldn't affect us since we only compile these at runtime and we're building on the same machine that we're running on. I've not measured if this improves the speed, only that the tests now pass.

Bug: llvm/llvm-project#69842 Bug: gentoo#33400 Reference: llvm/llvm-project#56854 Signed-off-by: Benda Xu <heroxbd@gentoo.org>

Closes: https://bugs.gentoo.org/916069 Bug: llvm/llvm-project#69842 Bug: gentoo#33400 Reference: llvm/llvm-project#56854 Signed-off-by: Benda Xu <heroxbd@gentoo.org>

github-actions bot added the new issue label Aug 1, 2022

kparzysz-quic added this to the LLVM 15.0.0 Release milestone Aug 1, 2022

EugeneZelenko added backend:X86 ABI Application Binary Interface and removed new issue labels Aug 1, 2022

kparzysz-quic closed this as completed Aug 3, 2022

nikic mentioned this issue Aug 4, 2022

LLVM 15 regression: fpext half to fp128 #56911

Closed

andrewrk added a commit to ziglang/zig that referenced this issue Aug 4, 2022

compiler_rt: update ABI for x86 float16 functions

169ad1a

See llvm/llvm-project#56854 for more details.

phoebewang mentioned this issue Aug 11, 2022

PR for llvm/llvm-project#57080 llvm/llvm-project-release-prs#93

Merged

franz mentioned this issue Feb 7, 2023

fp16 related build issues on JLSE CHIP-SPV/chipStar#352

Closed

pjaaskel mentioned this issue Jul 14, 2023

Error when running benchmark - undefined reference to __extendhfsf2 CHIP-SPV/chipStar#524

Closed

sogartar mentioned this issue Aug 2, 2023

link error: undefined symbol: __truncsfhf2 iree-org/iree#14549

Closed

dudeofea mentioned this issue Aug 7, 2023

EXLA Nx.as_type(:f16) broken in livebook docker image livebook-dev/livebook#2147

Closed

scchan mentioned this issue Aug 25, 2023

rocFFT Test Suite Fails ROCm/rocFFT#439

Closed

jan-wassenberg mentioned this issue Sep 6, 2023

Can't build with Clang 16 google/highway#1709

Closed

heroxbd added a commit to heroxbd/gentoo that referenced this issue Nov 19, 2023

sys-libs/compiler-rt: float16 ABI build condition.

ba0c087

Bug: llvm/llvm-project#69842 Bug: gentoo#33400 Reference: llvm/llvm-project#56854 Signed-off-by: Benda Xu <heroxbd@gentoo.org>

heroxbd mentioned this issue Nov 19, 2023

sys-libs/compiler-rt: float16 ABI build condition. gentoo/gentoo#33900

Closed

usamoi mentioned this issue Dec 11, 2023

linking with C library changes the behavior of _Float16 rust-lang/rust#118813

Closed

jan-wassenberg mentioned this issue Dec 14, 2023

Made enhancements to BitCastScalar and fixed F16 compare operators google/highway#1884

Merged

laytan mentioned this issue Mar 25, 2024

f16 comparisons fail on Intel MacOS + LLVM17 odin-lang/Odin#3222

Closed

tgross35 mentioned this issue Apr 14, 2024

f16 generates code that uses the incorrect ABI for compiler-rt rust-lang/rust#123885

Closed

jan-wassenberg mentioned this issue Aug 14, 2024

Add google-highway mesonbuild/wrapdb#1611

Draft

Lunderberg mentioned this issue Aug 22, 2024

[CI][Windows] Workaround for error in Findzstd.cmake apache/tvm#17283

Draft
