Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openexr 3.1.6 regression: does not compile on ARM v7 due to unguarded use of unavailable AARCH64 intrinsics #1365

Closed
mandree opened this issue Mar 19, 2023 · 6 comments · Fixed by #1366

Comments

@mandree
Copy link
Contributor

mandree commented Mar 19, 2023

@aras-p Pull Request #1348 and commit 677c6a5 seem to introduce breaking changes on ARMv7, see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270348

The code assumes that if ARM NEON extensions are available, it can use vqtbl1q_u8, vzip1q_u8 and vzip2q_8.
From the little I understand from browsing ARM NEON arch' manuals, this intrinsic appears to only be available on AArch64 but not 32-bit ARMv7.

If that is true: Can we please have guards around the Aarch64 stuff?
If I am misunderstanding, what do you need or suggest to solve this?

@cary-ilm
Copy link
Member

Is this fixed by #1358? If not, better guards are in order.

@mandree mandree changed the title openexr 3.1.6 regression: does not compile on ARM v7 openexr 3.1.6 regression: does not compile on ARM v7 due to unguarded use of unavailable AARCH64 intrinsics Mar 19, 2023
@mandree
Copy link
Contributor Author

mandree commented Mar 19, 2023

@cary-ilm I don't have ARMv7 around, I am relaying a report from a FreeBSD fellow; reading #1358 it looks as though the detection would be helpful, but since that emulates only vst1q... but not vqtbl1q... I don't see how it could possibly help.
Furthermore, this does not appear to be a missing feature of the compiler or header file, but actual unavailability of the ARMv8 Aarch64 NEON instruction on ARMv7 hardware. (I have looked at LLVM/clang 10.0 and 4.0 headers which would define this if on __ARM_NEON and __aarch64__. FreeBSD uses LLVM for its cc/c++.)

@cary-ilm
Copy link
Member

@mandree I don't have an ARM device to test with either, relying on others to do the leg-work. @aras-p, what solution to you recommend? The substitute intrinsic that @betajippity provided in #1358 looks close but there's more missing, right? Thanks for any help in getting this sorted out.

@mandree
Copy link
Contributor Author

mandree commented Mar 19, 2023

My take is that some #ifdef/#define work would be in order that sets ILM_HAVE_NEON_AARCH64 or similar, and makes the vzip and vqtbl stuff depend on that rather than just ILM_HAVE_NEON. If done as bluntly as suggested here, that would make the optimization unavailable on ARMv7.

I have sketched this proposal in two attachments to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270348 - but unless there are newer comments in that bug report, these are untested, so I do not know if they are doing the right thing.

freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Mar 19, 2023
@aras-p
Copy link
Contributor

aras-p commented Mar 20, 2023

Ah I did not realize that anyone would build OpenEXR on armv7. My bad, sorry! I think I'd go for just not using the NEON code path there (i.e. only enable it on armv8/aarch64); that way it stays the same as it was before #1348. I'll try to submit a fix today.

@clausecker
Copy link

@aras-p OpenEXR appears as a transitive dependency in some desktop software. E.g. ImageMagick and windowmaker depend on it.

aras-p added a commit to aras-p/openexr that referenced this issue Mar 20, 2023
…ch64) only

Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.
aras-p added a commit to aras-p/openexr that referenced this issue Mar 20, 2023
…ch64) only

Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.

Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
cary-ilm pushed a commit that referenced this issue Mar 20, 2023
…ch64) only (#1366)

Should fix #1365. Recent PR (#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.

Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
mandree referenced this issue Mar 21, 2023
* Enable fast Huffman decoding on macOS

Enable fast Huffman decoding for macOS (x86 and Apple silicon)

Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com>

* Implement Huffman zig-zag transform

Implements Huffman zig-zag transform and 32 to 16 bit floating point

Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com>

Signed-off-by: Developer Ecosystem Engineering <DeveloperEcosystemEngineering@apple.com>
vishwin pushed a commit to vishwin/freebsd-ports that referenced this issue Mar 21, 2023
OpenEXR 3.1.6 introduced several NEON-based optimizations
that implied Aarch64. Add patched, either picked from
upstream, or written by mandree@, to enable those
NEON features that also require Aarch64 only there.

PR-1366 is cherry-picked from upstream, and patch-lib/patch-test files
are my work but build upon said PR.

Also cherry-pick PR1354 that adds a missing check for AVX,
which is why I am bumping PORTREVISION because it might change
code (I have not checked).

AcademySoftwareFoundation/openexr#1365

PR:		270348
Reported by:	fuz@ (Robert Clausecker)
freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Mar 21, 2023
OpenEXR 3.1.6 introduced several NEON-based optimizations
that implied Aarch64. Add patched, either picked from
upstream, or written by mandree@, to enable those
NEON features that also require Aarch64 only there.

PR-1366 is cherry-picked from upstream, and patch-lib/patch-test files
are my work but build upon said PR.

Also cherry-pick PR1354 that adds a missing check for AVX,
which is why I am bumping PORTREVISION because it might change
code (I have not checked).

AcademySoftwareFoundation/openexr#1365

PR:		270348
Reported by:	fuz@ (Robert Clausecker)

(cherry picked from commit 54d6860)
cary-ilm pushed a commit to cary-ilm/openexr that referenced this issue Mar 26, 2023
…ch64) only (AcademySoftwareFoundation#1366)

Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.

Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
cary-ilm pushed a commit to cary-ilm/openexr that referenced this issue Mar 26, 2023
…ch64) only (AcademySoftwareFoundation#1366)

Should fix AcademySoftwareFoundation#1365. Recent PR (AcademySoftwareFoundation#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.

Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
cary-ilm pushed a commit that referenced this issue Mar 28, 2023
…ch64) only (#1366)

Should fix #1365. Recent PR (#1348) added NEON accelerated code paths
for ZIP filtering. But that code uses several instructions that are
ARMv8 (aarch64) only, and thus fail building on 32-bit ARM (armv7)
platforms. Make these optimizations only kick in when building
for 64-bit ARM platforms.

Signed-off-by: Aras Pranckevicius <aras@nesnausk.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants