-
Notifications
You must be signed in to change notification settings - Fork 616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
openexr 3.1.6 regression: does not compile on ARM v7 due to unguarded use of unavailable AARCH64 intrinsics #1365
Comments
Is this fixed by #1358? If not, better guards are in order. |
@cary-ilm I don't have ARMv7 around, I am relaying a report from a FreeBSD fellow; reading #1358 it looks as though the detection would be helpful, but since that emulates only vst1q... but not vqtbl1q... I don't see how it could possibly help. |
@mandree I don't have an ARM device to test with either, relying on others to do the leg-work. @aras-p, what solution to you recommend? The substitute intrinsic that @betajippity provided in #1358 looks close but there's more missing, right? Thanks for any help in getting this sorted out. |
My take is that some #ifdef/#define work would be in order that sets ILM_HAVE_NEON_AARCH64 or similar, and makes the vzip and vqtbl stuff depend on that rather than just ILM_HAVE_NEON. If done as bluntly as suggested here, that would make the optimization unavailable on ARMv7. I have sketched this proposal in two attachments to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270348 - but unless there are newer comments in that bug report, these are untested, so I do not know if they are doing the right thing. |
Upstream bug report: AcademySoftwareFoundation/openexr#1365 PR: 270348
Ah I did not realize that anyone would build OpenEXR on armv7. My bad, sorry! I think I'd go for just not using the NEON code path there (i.e. only enable it on armv8/aarch64); that way it stays the same as it was before #1348. I'll try to submit a fix today. |
@aras-p OpenEXR appears as a transitive dependency in some desktop software. E.g. ImageMagick and windowmaker depend on it. |
…ch64) only Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms.
…ch64) only Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms. Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
…ch64) only (#1366) Should fix #1365. Recent PR (#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms. Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
* Enable fast Huffman decoding on macOS Enable fast Huffman decoding for macOS (x86 and Apple silicon) Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com> * Implement Huffman zig-zag transform Implements Huffman zig-zag transform and 32 to 16 bit floating point Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com> Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com>
OpenEXR 3.1.6 introduced several NEON-based optimizations that implied Aarch64. Add patched, either picked from upstream, or written by mandree@, to enable those NEON features that also require Aarch64 only there. PR-1366 is cherry-picked from upstream, and patch-lib/patch-test files are my work but build upon said PR. Also cherry-pick PR1354 that adds a missing check for AVX, which is why I am bumping PORTREVISION because it might change code (I have not checked). AcademySoftwareFoundation/openexr#1365 PR: 270348 Reported by: fuz@ (Robert Clausecker)
OpenEXR 3.1.6 introduced several NEON-based optimizations that implied Aarch64. Add patched, either picked from upstream, or written by mandree@, to enable those NEON features that also require Aarch64 only there. PR-1366 is cherry-picked from upstream, and patch-lib/patch-test files are my work but build upon said PR. Also cherry-pick PR1354 that adds a missing check for AVX, which is why I am bumping PORTREVISION because it might change code (I have not checked). AcademySoftwareFoundation/openexr#1365 PR: 270348 Reported by: fuz@ (Robert Clausecker) (cherry picked from commit 54d6860)
…ch64) only (AcademySoftwareFoundation#1366) Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms. Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
…ch64) only (AcademySoftwareFoundation#1366) Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms. Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
…ch64) only (#1366) Should fix #1365. Recent PR (#1348) added NEON accelerated code paths for ZIP filtering. But that code uses several instructions that are ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7) platforms. Make these optimizations only kick in when building for 64-bit ARM platforms. Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
@aras-p Pull Request #1348 and commit 677c6a5 seem to introduce breaking changes on ARMv7, see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270348
The code assumes that if ARM NEON extensions are available, it can use
vqtbl1q_u8
,vzip1q_u8
andvzip2q_8
.From the little I understand from browsing ARM NEON arch' manuals, this intrinsic appears to only be available on AArch64 but not 32-bit ARMv7.
If that is true: Can we please have guards around the Aarch64 stuff?
If I am misunderstanding, what do you need or suggest to solve this?
The text was updated successfully, but these errors were encountered: