Releases: ARM-software/optimized-routines

v24.05 release

23 May 11:25
  • Math routine changes
    • Fixed AdvSIMD vector powf and log for the big-endian target.
    • Fixed an undefined signed shift in the exp10 code, unlikely
      to cause problems in practice.
    • AdvSIMD pow got minor optimizations.
    • Added a build option to disable the SIMD and exp10 tests,
      so that libcs lacking those symbols can be tested.
  • pl/ directory
    • Several big-endian fixes and code cleanups.
    • The directory continues to host many math routines of mixed quality.
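
The "undefined signed shift" item above refers to a general C pitfall: left-shifting a negative signed integer is undefined behaviour in ISO C. The sketch below illustrates one common way to avoid it, by doing the shift on an unsigned type. The helper name shl_signed is hypothetical and this is not the actual exp10 code; the conversion back to int64_t is implementation-defined for out-of-range values, and the sketch assumes the usual two's complement wrap-around that GCC and Clang document.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the exp10 source.
   Shifts a signed 64-bit value left without invoking undefined
   behaviour: the shift is performed on uint64_t, where it is well
   defined for every bit pattern as long as n < 64.  */
static inline int64_t
shl_signed (int64_t x, unsigned n)
{
  /* Writing x << n directly would be undefined whenever x is negative.  */
  return (int64_t) ((uint64_t) x << n);
}
```

A caller that needs to move a possibly negative exponent into the exponent field of a double could, under these assumptions, write shl_signed (k, 52) instead of k << 52.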

v24.01 release

12 Jan 13:18
  • String routine changes
    • Added memcpy, memmove, memset for MOPS extension.
    • Optimized memcpy by improving code alignment.
    • Fixed GNU property note on ILP32.
  • Math routine changes
    • Vector math code now uses ACLE intrinsics and supports AArch64 only.
    • Vector math code no longer builds scalar and base PCS variants.
    • Optimized vector sin and cos.
    • Added tgamma128, a binary128 tgammal implementation.
  • pl/ directory
    • The directory continues to host many math routines of mixed quality.

v23.01 release

25 Jan 12:34
  • Project changes
    • All files are now under a new dual license (MIT OR Apache-2.0 WITH LLVM-exception, at the election of the user).
    • Added MAINTAINERS file describing who maintains the subdirectories.
    • Added README.contributors files documenting contribution requirements.
    • Added new pl/ subdirectory for Arm's Performance Library related routines.
  • String routine changes
    • Added memset benchmark.
    • Improved strlen and memcpy benchmarks.
    • Added SVE memcpy.
    • Updated arm string functions to support M-profile PACBTI.
    • Merged the MTE and generic versions of strcmp, strncmp, strcpy and stpcpy into one implementation.
    • Optimized memcmp, memchr-mte, memrchr, strchr-mte, strchrnul-mte, strrchr-mte, strlen, strlen-mte, strnlen, strcpy.
  • Math routine changes
    • Fixed constants in sinf, cosf and sincosf so that they are computed at compile time even with gcc-12 -frounding-math.
    • Fixed an invalid shift in logf.
    • Support floating-point exceptions in vector math routines when WANT_SIMD_EXCEPT is set.

v21.02 release

18 Feb 14:31
  • String routine changes
    • Added AArch64 ILP32 ABI support.
    • Fixed SVE strnlen return value.
    • Added MTE related __mtag_tag_region.
    • Added MTE related __mtag_tag_zero_region.
    • Minor code cleanups.

v20.11 release

16 Nov 13:20
  • New math routines
    • Scalar erff and erf using fma.

v20.08 release

14 Aug 12:49
  • Bug fixes
    • strcmp-mte nul check
    • strncmp-mte with large size
    • arm memcpy with large size (CVE-2020-6096)
  • String routines performance improvements
    • strlen
    • memmove with backward copy
  • Benchmarking code for strings and memory routines
    • strlen
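
The public description of CVE-2020-6096 characterizes it as a signed comparison on the length argument of a 32-bit Arm memcpy. The sketch below only models that class of bug in portable C; the function names and the 64-byte threshold are hypothetical and are not taken from the assembly source. It assumes the usual two's complement result when converting a large uint32_t to int32_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a length check in a 32-bit memcpy.
   The buggy form compares the length as a signed value, so a length
   of 0x80000000 or more looks negative and is routed to the code
   path meant for small copies.  */
bool
small_copy_signed (uint32_t n)
{
  return (int32_t) n < 64;   /* wrong for n >= 0x80000000 */
}

/* The corrected form compares the length as unsigned.  */
bool
small_copy_unsigned (uint32_t n)
{
  return n < 64;
}
```

With these definitions, a length of 0x80000000 is classified as a small copy by the signed check and as a large copy by the unsigned one, which is the kind of mismatch a "large size" fix addresses.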

v20.05 release

29 May 13:28
  • New functionality (64-bit Arm)
    • string: Optimized MTE variants of strlen, strnlen, strchr, strchrnul, strrchr, memchr, memrchr, strcpy, stpcpy, strcmp, strncmp
    • string: Changes to support BTI
    • string: New optimized memrchr, strnlen
  • Performance improvements (Neoverse N1)
    • strchr/strchrnul: 21% improvement on long strings
    • strrchr: 11% improvement
    • strnlen: 130% improvement on long strings, 50% on short strings
  • Benchmark and tests
    • string: New memcpy benchmark
    • string: Cleanup testsuite and improve test coverage

v20.02 release

28 Feb 14:34

New functionality

  • string: New strrchr and stpcpy routines
  • string: New Memory Tagging Extension (MTE) variants of strlen and strchr
  • math: New vector version of pow(double)
  • networking: Optimized ones' complement checksum for 32-bit and 64-bit Arm
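
For readers unfamiliar with the ones' complement checksum mentioned above, here is a minimal, unoptimized reference sketch in the style of the Internet checksum. The name ones_complement_sum and its signature are assumptions made for illustration, not the library's interface. It returns the folded 16-bit sum, reading words in big-endian order; callers that need the checksum field itself would typically take the bitwise complement of the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not the optimized implementation.
   Computes the 16-bit ones' complement sum of a byte buffer, treating
   it as a sequence of big-endian 16-bit words.  */
uint16_t
ones_complement_sum (const unsigned char *buf, size_t len)
{
  uint64_t acc = 0;   /* wide accumulator, carries are folded at the end */
  size_t i = 0;

  for (; i + 1 < len; i += 2)
    acc += ((uint32_t) buf[i] << 8) | buf[i + 1];

  /* An odd trailing byte is padded with a zero low byte.  */
  if (i < len)
    acc += (uint32_t) buf[i] << 8;

  /* Fold the carries back into the low 16 bits (end-around carry).  */
  while (acc >> 16)
    acc = (acc & 0xffff) + (acc >> 16);

  return (uint16_t) acc;
}
```

Optimized versions usually gain their speed by accumulating wider words per iteration and folding once at the end, which works because ones' complement addition is associative and commutative.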

Performance improvements

  • string: Improved memcpy and memmove (SIMD and non-SIMD) for 64-bit Arm
  • string: Improved memset for 64-bit Arm