[AArch64] Consider histcnt smaller than i32 in the cost model #108521

SamTebbs33 · 2024-09-13T09:29:05Z

This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from
#101017 and #103037

Work by Max Beck-Jones (@DevM-uk)

@DevM-uk

This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from llvm#101017 and llvm#103037 Work by Max Beck-Jones (@DevM-uk)

llvmbot · 2024-09-13T09:29:35Z

@llvm/pr-subscribers-llvm-analysis

Author: Sam Tebbs (SamTebbs33)

Changes

This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from
#101017 and #103037

Work by Max Beck-Jones (@DevM-uk)

Full diff: https://github.com/llvm/llvm-project/pull/108521.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+17-11)
(modified) llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll (+9-9)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 58c267f1ce4bd6..83b5344fc8ed24 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -517,25 +517,31 @@ static bool isUnpackedVectorVT(EVT VecVT) {
 static InstructionCost getHistogramCost(const IntrinsicCostAttributes &ICA) {
   Type *BucketPtrsTy = ICA.getArgTypes()[0]; // Type of vector of pointers
   Type *EltTy = ICA.getArgTypes()[1];        // Type of bucket elements
+  unsigned TotalHistCnts = 1;
 
-  // Only allow (32b and 64b) integers or pointers for now...
+  // Only allow (up to 64b) integers or pointers
   if ((!EltTy->isIntegerTy() && !EltTy->isPointerTy()) ||
-      (EltTy->getScalarSizeInBits() != 32 &&
-       EltTy->getScalarSizeInBits() != 64))
+      EltTy->getScalarSizeInBits() > 64)
     return InstructionCost::getInvalid();
 
-  // FIXME: Hacky check for legal vector types. We can promote smaller types
-  //        but we cannot legalize vectors via splitting for histcnt.
   // FIXME: We should be able to generate histcnt for fixed-length vectors
   //        using ptrue with a specific VL.
-  if (VectorType *VTy = dyn_cast<VectorType>(BucketPtrsTy))
-    if ((VTy->getElementCount().getKnownMinValue() != 2 &&
-         VTy->getElementCount().getKnownMinValue() != 4) ||
-        VTy->getPrimitiveSizeInBits().getKnownMinValue() > 128 ||
-        !VTy->isScalableTy())
+  if (VectorType *VTy = dyn_cast<VectorType>(BucketPtrsTy)) {
+    unsigned EC = VTy->getElementCount().getKnownMinValue();
+    if (!isPowerOf2_64(EC) || !VTy->isScalableTy())
       return InstructionCost::getInvalid();
 
-  return InstructionCost(BaseHistCntCost);
+    bool Element64b = EltTy->isIntegerTy(64);
+
+    if (EC == 2 || (!Element64b && EC == 4))
+      return InstructionCost(BaseHistCntCost);
+
+    unsigned NaturalVectorWidth = Element64b ? AArch64::SVEBitsPerBlock / 64
+                                             : AArch64::SVEBitsPerBlock / 32;
+    TotalHistCnts = EC / NaturalVectorWidth;
+  }
+
+  return InstructionCost(BaseHistCntCost * TotalHistCnts);
 }
 
 InstructionCost
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll b/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
index aede9c89843128..1ecd02e5c124a6 100644
--- a/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
@@ -971,26 +971,26 @@ define void @histogram_nxv4i32(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %m
   ret void
 }
 
-define void @histogram_nxv8i16(<vscale x 8 x ptr> %buckets, <vscale x 8 x i1> %mask) {
+define void @histogram_nxv8i16(<vscale x 8 x ptr> %buckets, <vscale x 8 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv8i16'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv8i16'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
   ret void
 }
 
-define void @histogram_nxv16i8(<vscale x 16 x ptr> %buckets, <vscale x 16 x i1> %mask) {
+define void @histogram_nxv16i8(<vscale x 16 x ptr> %buckets, <vscale x 16 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv16i8'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv16i8'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv16p0.i64(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
@@ -1049,13 +1049,13 @@ define void @histogram_v16i8(<16 x ptr> %buckets, <16 x i1> %mask) {
   ret void
 }
 
-define void @histogram_nxv4i64(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %mask) {
+define void @histogram_nxv4i64(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv4i64'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv4i64'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)

llvmbot · 2024-09-13T09:29:35Z

@llvm/pr-subscribers-backend-aarch64

Author: Sam Tebbs (SamTebbs33)

Changes

This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from
#101017 and #103037

Work by Max Beck-Jones (@DevM-uk)

Full diff: https://github.com/llvm/llvm-project/pull/108521.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+17-11)
(modified) llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll (+9-9)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 58c267f1ce4bd6..83b5344fc8ed24 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -517,25 +517,31 @@ static bool isUnpackedVectorVT(EVT VecVT) {
 static InstructionCost getHistogramCost(const IntrinsicCostAttributes &ICA) {
   Type *BucketPtrsTy = ICA.getArgTypes()[0]; // Type of vector of pointers
   Type *EltTy = ICA.getArgTypes()[1];        // Type of bucket elements
+  unsigned TotalHistCnts = 1;
 
-  // Only allow (32b and 64b) integers or pointers for now...
+  // Only allow (up to 64b) integers or pointers
   if ((!EltTy->isIntegerTy() && !EltTy->isPointerTy()) ||
-      (EltTy->getScalarSizeInBits() != 32 &&
-       EltTy->getScalarSizeInBits() != 64))
+      EltTy->getScalarSizeInBits() > 64)
     return InstructionCost::getInvalid();
 
-  // FIXME: Hacky check for legal vector types. We can promote smaller types
-  //        but we cannot legalize vectors via splitting for histcnt.
   // FIXME: We should be able to generate histcnt for fixed-length vectors
   //        using ptrue with a specific VL.
-  if (VectorType *VTy = dyn_cast<VectorType>(BucketPtrsTy))
-    if ((VTy->getElementCount().getKnownMinValue() != 2 &&
-         VTy->getElementCount().getKnownMinValue() != 4) ||
-        VTy->getPrimitiveSizeInBits().getKnownMinValue() > 128 ||
-        !VTy->isScalableTy())
+  if (VectorType *VTy = dyn_cast<VectorType>(BucketPtrsTy)) {
+    unsigned EC = VTy->getElementCount().getKnownMinValue();
+    if (!isPowerOf2_64(EC) || !VTy->isScalableTy())
       return InstructionCost::getInvalid();
 
-  return InstructionCost(BaseHistCntCost);
+    bool Element64b = EltTy->isIntegerTy(64);
+
+    if (EC == 2 || (!Element64b && EC == 4))
+      return InstructionCost(BaseHistCntCost);
+
+    unsigned NaturalVectorWidth = Element64b ? AArch64::SVEBitsPerBlock / 64
+                                             : AArch64::SVEBitsPerBlock / 32;
+    TotalHistCnts = EC / NaturalVectorWidth;
+  }
+
+  return InstructionCost(BaseHistCntCost * TotalHistCnts);
 }
 
 InstructionCost
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll b/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
index aede9c89843128..1ecd02e5c124a6 100644
--- a/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
@@ -971,26 +971,26 @@ define void @histogram_nxv4i32(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %m
   ret void
 }
 
-define void @histogram_nxv8i16(<vscale x 8 x ptr> %buckets, <vscale x 8 x i1> %mask) {
+define void @histogram_nxv8i16(<vscale x 8 x ptr> %buckets, <vscale x 8 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv8i16'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv8i16'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv8p0.i16(<vscale x 8 x ptr> %buckets, i16 1, <vscale x 8 x i1> %mask)
   ret void
 }
 
-define void @histogram_nxv16i8(<vscale x 16 x ptr> %buckets, <vscale x 16 x i1> %mask) {
+define void @histogram_nxv16i8(<vscale x 16 x ptr> %buckets, <vscale x 16 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv16i8'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv16i8'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vector.histogram.add.nxv16p0.i8(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv16p0.i64(<vscale x 16 x ptr> %buckets, i8 1, <vscale x 16 x i1> %mask)
@@ -1049,13 +1049,13 @@ define void @histogram_v16i8(<16 x ptr> %buckets, <16 x i1> %mask) {
   ret void
 }
 
-define void @histogram_nxv4i64(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %mask) {
+define void @histogram_nxv4i64(<vscale x 4 x ptr> %buckets, <vscale x 4 x i1> %mask) #3 {
 ; CHECK-LABEL: 'histogram_nxv4i64'
-; CHECK-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPE_BASED_ONLY-LABEL: 'histogram_nxv4i64'
-; TYPE_BASED_ONLY-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
+; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)
 ; TYPE_BASED_ONLY-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vector.histogram.add.nxv4p0.i64(<vscale x 4 x ptr> %buckets, i64 1, <vscale x 4 x i1> %mask)

huntergr-arm · 2024-09-16T14:10:48Z

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp


-  // Only allow (32b and 64b) integers or pointers for now...
+  // Only allow (up to 64b) integers or pointers


I'd suggest adding a variable unsigned EltSize = EltTy->getScalarSizeInBits(); and using it both in the initial size check and the total size check below.

huntergr-arm · 2024-09-16T14:14:49Z

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

      return InstructionCost::getInvalid();

-  return InstructionCost(BaseHistCntCost);
+    bool Element64b = EltTy->isIntegerTy(64);


Using the EltSize suggestion from above:

Suggested change

bool Element64b = EltTy->isIntegerTy(64);

// HistCnt only supports 32b and 64b element types.

unsigned LegalEltSize = EltSize <= 32 ? 32 : 64;

if (EC == 2 || (LegalEltSize == 32 && EC == 4))

return InstructionCost(BaseHistCntCost);

unsigned NaturalVectorWidth = AArch64::SVEBitsPerBlock / LegalEltSize;

github-actions · 2024-09-17T13:30:56Z

✅ With the latest revision this PR passed the C/C++ code formatter.

huntergr-arm

LGTM, though you'll need to fix the formatting :)

AaronBallman · 2024-09-19T13:33:18Z

I think this caused a new build failure:

FAILED: lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o 
ccache /home/docker/llvm-external-buildbots/clang.17.0.6/bin/clang++ --gcc-toolchain=/gcc-toolchain/usr -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/Target/AArch64 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o -MF lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o.d -o lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^             ~~
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: note: add parentheses after the '!' to evaluate the comparison first
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^
      |                      (                 )
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: note: add parentheses around left hand side expression to silence this warning
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^
      |                     (            )
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:35: error: result of comparison of constant 32 with expression of type 'bool' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ~~~~~~~~~~~~~ ^  ~~
2 errors generated.

https://lab.llvm.org/buildbot/#/builders/145/builds/1925

SamTebbs33 · 2024-09-19T13:41:03Z

I think this caused a new build failure:

FAILED: lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o 
ccache /home/docker/llvm-external-buildbots/clang.17.0.6/bin/clang++ --gcc-toolchain=/gcc-toolchain/usr -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/lib/Target/AArch64 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o -MF lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o.d -o lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64TargetTransformInfo.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^             ~~
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: note: add parentheses after the '!' to evaluate the comparison first
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^
      |                      (                 )
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:21: note: add parentheses around left hand side expression to silence this warning
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ^
      |                     (            )
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:537:35: error: result of comparison of constant 32 with expression of type 'bool' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
  537 |     if (EC == 2 || (!LegalEltSize == 32 && EC == 4))
      |                     ~~~~~~~~~~~~~ ^  ~~
2 errors generated.

https://lab.llvm.org/buildbot/#/builders/145/builds/1925

Thanks Aaron, I'll commit a fix.

This fixes a build failure caused by #108521

@DevM-uk

…08521) This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from llvm#101017 and llvm#103037 Work by Max Beck-Jones (@DevM-uk) --------- Co-authored-by: DevM-uk <max.beck-jones@arm.com>

This fixes a build failure caused by llvm#108521

[AArch64] Consider histcnt smaller than i32 in the cost model

6ca9685

This PR updates the AArch64 cost model to consider the cheaper cost of <i32 histograms to reflect the improvements from llvm#101017 and llvm#103037 Work by Max Beck-Jones (@DevM-uk)

SamTebbs33 requested review from MacDue, DevM-uk, huntergr-arm, sdesmalen-arm, kmclaughlin-arm and NickGuy-Arm September 13, 2024 09:29

llvmbot added backend:AArch64 llvm:analysis labels Sep 13, 2024

huntergr-arm reviewed Sep 16, 2024

View reviewed changes

Add EltSize

282b196

huntergr-arm approved these changes Sep 18, 2024

View reviewed changes

Format

6b1d6b2

SamTebbs33 merged commit b49a6b2 into llvm:main Sep 19, 2024
8 checks passed

SamTebbs33 deleted the histcnt-costmodel branch September 19, 2024 12:56

SamTebbs33 added a commit that referenced this pull request Sep 19, 2024

[AArch64] Fix build error from extra !

b1b436c

This fixes a build failure caused by #108521

tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024

[AArch64] Fix build error from extra !

f5fa16c

This fixes a build failure caused by llvm#108521

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AArch64] Consider histcnt smaller than i32 in the cost model #108521

[AArch64] Consider histcnt smaller than i32 in the cost model #108521

SamTebbs33 commented Sep 13, 2024

llvmbot commented Sep 13, 2024

llvmbot commented Sep 13, 2024

huntergr-arm Sep 16, 2024

SamTebbs33 Sep 17, 2024

huntergr-arm Sep 16, 2024

github-actions bot commented Sep 17, 2024 •

edited

Loading

huntergr-arm left a comment

AaronBallman commented Sep 19, 2024

SamTebbs33 commented Sep 19, 2024


		// Only allow (32b and 64b) integers or pointers for now...
		// Only allow (up to 64b) integers or pointers

-    bool Element64b = EltTy->isIntegerTy(64);
+    // HistCnt only supports 32b and 64b element types.
+    unsigned LegalEltSize = EltSize <= 32 ? 32 : 64;
+    if (EC == 2 || (LegalEltSize == 32 && EC == 4))
+      return InstructionCost(BaseHistCntCost);
+    unsigned NaturalVectorWidth = AArch64::SVEBitsPerBlock / LegalEltSize;

[AArch64] Consider histcnt smaller than i32 in the cost model #108521

[AArch64] Consider histcnt smaller than i32 in the cost model #108521

Conversation

SamTebbs33 commented Sep 13, 2024

llvmbot commented Sep 13, 2024

llvmbot commented Sep 13, 2024

huntergr-arm Sep 16, 2024

Choose a reason for hiding this comment

SamTebbs33 Sep 17, 2024

Choose a reason for hiding this comment

huntergr-arm Sep 16, 2024

Choose a reason for hiding this comment

github-actions bot commented Sep 17, 2024 • edited Loading

huntergr-arm left a comment

Choose a reason for hiding this comment

AaronBallman commented Sep 19, 2024

SamTebbs33 commented Sep 19, 2024

github-actions bot commented Sep 17, 2024 •

edited

Loading