Skip to content

Commit

Permalink
fix(simd.h): Address NEON issues (AcademySoftwareFoundation#4143)
Browse files Browse the repository at this point in the history
Primarily, recent changes (PR AcademySoftwareFoundation#4071) to vint4::store for the NEON case
appear to have some type mismatches, which apple clang on ARM-based Mac
(including our CI) seems ok with, but which is generating type errors on
other ARM Linux platforms.

I think the types were weird here, so I tightened it up to get the types
right for temporary variables in that function. That's the primary fix
here.

Secondarily, I modified simd.h and the CMake setup so that build option
USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I
realized that USE_SIMD=0 was not disabling NEON, so there was no way for
a NEON platform to completely disable SIMD if they needed to.)

Fixes AcademySoftwareFoundation#4111

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Peter Kovář <peter.kovar@reflexion.tv>
  • Loading branch information
lgritz authored and 1div0 committed Feb 24, 2024
1 parent 377f467 commit 35b2f1c
Show file tree
Hide file tree
Showing 2 changed files with 11 additions and 4 deletions.
2 changes: 1 addition & 1 deletion src/cmake/compiler.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -303,7 +303,7 @@ set (SIMD_COMPILE_FLAGS "")
if (NOT USE_SIMD STREQUAL "")
message (STATUS "Compiling with SIMD level ${USE_SIMD}")
if (USE_SIMD STREQUAL "0")
set (SIMD_COMPILE_FLAGS ${SIMD_COMPILE_FLAGS} "-DOIIO_NO_SSE=1")
set (SIMD_COMPILE_FLAGS ${SIMD_COMPILE_FLAGS} "-DOIIO_NO_SIMD=1")
else ()
string (REPLACE "," ";" SIMD_FEATURE_LIST ${USE_SIMD})
foreach (feature ${SIMD_FEATURE_LIST})
Expand Down
13 changes: 10 additions & 3 deletions src/include/OpenImageIO/simd.h
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,13 @@
// OIIO_SIMD_HAS_SIMD8 : nonzero if vfloat8, vint8, vbool8 are defined
// OIIO_SIMD_HAS_SIMD16 : nonzero if vfloat16, vint16, vbool16 are defined

#ifdef OIIO_NO_SIMD /* Request to disable all SIMD */
# define OIIO_NO_SSE 1
# define OIIO_NO_AVX 1
# define OIIO_NO_AVX2 1
# define OIIO_NO_NEON 1
#endif

#if defined(_M_ARM64) || defined(__aarch64__) || defined(__aarch64)
# ifndef __ARM_NEON__
# define __ARM_NEON__
Expand Down Expand Up @@ -4788,9 +4795,9 @@ OIIO_FORCEINLINE void vint4::store (unsigned char *values) const {
_mm_store_ss((float*)values, _mm_castsi128_ps(val8));
#elif OIIO_SIMD_NEON
vint4 clamped = m_simd & vint4(0xff);
simd_t val16 = vcombine_s16(vqmovn_s32(clamped), vdup_n_s16(0));
simd_t val8 = vcombine_u8(vqmovun_s16(val16), vdup_n_u8(0));
vst1q_lane_u32((uint32_t*)values, val8, 0);
int16x8_t val16 = vcombine_s16(vqmovn_s32(clamped), vdup_n_s16(0));
uint8x16_t val8 = vcombine_u8(vqmovun_s16(val16), vdup_n_u8(0));
vst1q_lane_u32((uint32_t*)values, vreinterpretq_u32_u8(val8), 0);
#else
SIMD_DO (values[i] = m_val[i]);
#endif
Expand Down

0 comments on commit 35b2f1c

Please sign in to comment.