-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] OIIO 2.5.7.0 fails to build on Fedora only on aarch64 #4111
Comments
Hi Richard!
I'll give you more feedback on Debian later this week (maybe this evening CET) since I'm uploading v2.5.7.0 to Debian experimental prior to the SONAME transition and it'll be built on all release architectures, so even arm64. |
Here: ARM64 build Hope this helps somehow. |
Unfortuantely I couldn't find anything that helped. For now I have reverted to 2.5.6.0. |
Same issue for openSUSE Tumbleweed https://build.opensuse.org/package/live_build_log/graphics/OpenImageIO/openSUSE_Factory_ARM/aarch64 |
The original build expired so here's a new link showing the error for 2.5.8.0: |
openSUSE package updated to 2.5.8.0. same link as above. |
can we maybe revert the neon part from |
Thanks for your patience. I will look at this issue today. |
Proposed fix here: #4143 Can you try this patch out on your end before I merge it? The problem is that even though we now have an ARM based Mac in our CI suite, it looks like the flavor of Apple Clang it uses was perfectly happy to accept the code we had and just understand that it was fine to bitcast between those types, whereas it was rejected on the ARM Linux variants (which we don't have available to test on). Reading documentation on the intrinsics, I believe I was able to repair it, but won't be sure until you give it a try. The functions get tested, so I know they were correct execution all along, it's just a question of the compiler being strict about the type casting. |
package with the patch is building here: https://build.opensuse.org/package/show/home:darix:branches:graphics/OpenImageIO you can find multiple distributions and architectures there. it seems 2.5.8.0 broke the testsuite for us. even updating the images to the latest version does not help. for the record 2.5.7.0 still passed https://build.opensuse.org/package/show/openSUSE:Factory/OpenImageIO |
If I understand the failing aarch64 logs for the build in progress that you point to above, I believe that it is failing because the Imath/OpenEXR you have installed is 2.2, whereas OpenImageIO 2.5 requires a minimum OpenEXR 2.4. (Current OpenEXR is 3.2, so we're talking about extraordinarily old.) At least in that log, it's not getting anywhere near far enough in the build for the NEON patch to be relevant. It fails just trying to find dependencies. |
I will force newer BuildRequires. |
For reference, OpenEXR 2.4 was released in 2019. |
@hobbes1069 Your error was definitely on the NEON part. Please do let me know (if you can check) whether the patch from this PR gets you unstuck. |
@lgritz yes, seems to be all good now, just built 2.5.7.0. |
Outstanding, thanks @hobbes1069. Would you mind please clicking approve on #4143, since basically fixing your build break is the whole reason for that patch? Thanks. |
@lgritz got any ideas about the new testsuite failures on x86_64 https://build.opensuse.org/package/live_build_log/home:darix:branches:graphics/OpenImageIO/openSUSE_Tumbleweed/x86_64 ? |
@darix Those look suspiciously like the same failures as here https://github.com/AcademySoftwareFoundation/OpenImageIO/actions/runs/7811654989/job/21307148511 Are you by any chance using an exr that's either taken right from their master, or incorporates that broken patch from the top of their master? |
Yes, that openexr-CVE-2023-5841.patch is the one with the bug, as far as I can tell. |
oki. will keep that in mind. |
Sigh. CVEs. As I understand it, this CVE was addressing a potential memory clobber in a utility that is only used for OpenEXR's own testing. It not an exploitable problem in any application that uses openexr because it's not in code that any real app would call. But because everybody drops everything when the magic word "CVE" is uttered, lest an unaddressed CVE be a stain on the project and jeopardize its inclusion in distros, the "fixes" are rushed out without proper care or time to test. This is yet another instance of a CVE that reports something that isn't really a problem and the rush to fix caused a much worse one. (The situation was also compounded by an unrelated logistical problem, wherein the openexr team did not receive the email that reported the bug and only found out about the problem when the CVE was published, which was done without anybody checking in with the project authors to verify that we received the email or ask why we hadn't responded.) |
I brought up the state of the openexr patch to the suse security team so we can remedy the situation. :) now that we have a fix for OpenImageIO ... it is back to the coal mines and trying to hammer the cmake mess for MaterialX/OpenUSD into shape that one can finally install it properly. Thank you for your support and keep up the great work! :) |
So sorry about the openexr issue. We are working hard to try to fix it ASAP. We held up our release while we work it out, not realizing that the bad commit had been cherry picked into your patches. |
honestly compared to MaterialX/OpenUSD this is harmless :) |
all architectures are passing now for me |
Fixed via #4143. |
Primarily, recent changes (PR #4071) to vint4::store for the NEON case appear to have some type mismatches, which apple clang on ARM-based Mac (including our CI) seems ok with, but which is generating type errors on other ARM Linux platforms. I think the types were weird here, so I tightened it up to get the types right for temporary variables in that function. That's the primary fix here. Secondarily, I modified simd.h and the CMake setup so that build option USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I realized that USE_SIMD=0 was not disabling NEON, so there was no way for a NEON platform to completely disable SIMD if they needed to.) Fixes #4111 Signed-off-by: Larry Gritz <lg@larrygritz.com>
Primarily, recent changes (PR AcademySoftwareFoundation#4071) to vint4::store for the NEON case appear to have some type mismatches, which apple clang on ARM-based Mac (including our CI) seems ok with, but which is generating type errors on other ARM Linux platforms. I think the types were weird here, so I tightened it up to get the types right for temporary variables in that function. That's the primary fix here. Secondarily, I modified simd.h and the CMake setup so that build option USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I realized that USE_SIMD=0 was not disabling NEON, so there was no way for a NEON platform to completely disable SIMD if they needed to.) Fixes AcademySoftwareFoundation#4111 Signed-off-by: Larry Gritz <lg@larrygritz.com>
Primarily, recent changes (PR AcademySoftwareFoundation#4071) to vint4::store for the NEON case appear to have some type mismatches, which apple clang on ARM-based Mac (including our CI) seems ok with, but which is generating type errors on other ARM Linux platforms. I think the types were weird here, so I tightened it up to get the types right for temporary variables in that function. That's the primary fix here. Secondarily, I modified simd.h and the CMake setup so that build option USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I realized that USE_SIMD=0 was not disabling NEON, so there was no way for a NEON platform to completely disable SIMD if they needed to.) Fixes AcademySoftwareFoundation#4111 Signed-off-by: Larry Gritz <lg@larrygritz.com> Signed-off-by: Peter Kovář <peter.kovar@reflexion.tv>
Describe the bug
Full log:
https://kojipkgs.fedoraproject.org//work/tasks/1777/111791777/build.log
Perhaps this is do to the new SIMD updates? I never built 2.5.6.0 in the official repos but all test builds completed.
Platform information:
The text was updated successfully, but these errors were encountered: