
Default aarch64-min-jump-table-entries to 9 #98391

Open — wants to merge 1 commit into main
Conversation

hiraditya
Collaborator

@hiraditya hiraditya commented Jul 10, 2024

The previous experiment in #71166 set the value to 13, but the results weren't significantly better. This patch sets it to 9.

See #98223 for the motivation.

@llvmbot
Collaborator

llvmbot commented Jul 10, 2024

@llvm/pr-subscribers-backend-aarch64

Author: AdityaK (hiraditya)

Changes

The previous experiment in #71166 set the value to 13, but the results weren't significantly better. This patch sets it to 9.


Full diff: https://github.com/llvm/llvm-project/pull/98391.diff

1 file affected:

  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+1-1)
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 1fad1d5ca6d7d..47d179712a12c 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -72,7 +72,7 @@ static cl::opt<AArch64PAuth::AuthCheckMethod>
                                cl::values(AUTH_CHECK_METHOD_CL_VALUES_LR));
 
 static cl::opt<unsigned> AArch64MinimumJumpTableEntries(
-    "aarch64-min-jump-table-entries", cl::init(13), cl::Hidden,
+    "aarch64-min-jump-table-entries", cl::init(9), cl::Hidden,
     cl::desc("Set minimum number of entries to use a jump table on AArch64"));
 
 unsigned AArch64Subtarget::getVectorInsertExtractBaseCost() const {

@hiraditya hiraditya requested a review from ilinpv July 10, 2024 21:31
@davemgreen
Collaborator

What do you mean by "weren't significantly better", and do you have a reason to change this to 9?

@hiraditya
Collaborator Author

hiraditya commented Jul 11, 2024

What do you mean by "weren't significantly better", and do you have a reason to change this to 9?

Based on the comments here: #71166 (comment) and #71166 (comment)

13 is too large IMO; it results in a code-size increase, and it seems like the value in #71166 was chosen purely for benchmarking purposes. Is there a way to find a better number here?

The number of branches is on the order of 2*(number of cases); it may look fine for benchmarking purposes, but I don't think it is a good idea to generate so many branches when a jump table would have sufficed.

@hiraditya
Collaborator Author

On a related note, RISC-V chose a threshold of 5: 5973272

@david-arm
Contributor

Have you run this patch several times with perlbench from SPEC2017 to verify the before and after results? At the time I created #71166 I saw an improvement in performance, although I realise perlbench can be noisy. Do you have any SPEC2017 results you can share on this PR? Thanks!

@ilinpv
Contributor

ilinpv commented Jul 11, 2024

@hiraditya it might be worth trying a more flexible solution here to balance size and speed, for example setting aarch64-min-jump-table-entries to a different value when optimizing for size with -Os/-Oz. In addition, it could be set depending on branch profile counters.

@david-arm
Contributor

Just to clarify, I'm not against changing this value if it's beneficial, since #71166 was a bit of an experiment anyway.

It sounds from the comments (rather than the commit message) like you are worried about code size? If so, I can certainly understand that point of view. Like @ilinpv suggested, I believe we already use a lower value for -Oz, and we can extend this to -Os if that helps? Or, if you believe this patch has no effect on performance at -O3, perhaps we could just change the default at all optimisation levels.

@davemgreen
Collaborator

What do you mean by "weren't significantly better", and do you have a reason to change this to 9?

Based on the comments here: #71166 (comment) and #71166 (comment)

13 is too large IMO; it results in a code-size increase, and it seems like the value in #71166 was chosen purely for benchmarking purposes. Is there a way to find a better number here?

The number of branches is on the order of 2*(number of cases); it may look fine for benchmarking purposes, but I don't think it is a good idea to generate so many branches when a jump table would have sufficed.

I'm not particularly married to the current value. I'm not sure there is a single optimal value, and the one we have is kind of odd, but if we are changing it then I would hope we had a better reason. We used to have a lower value; GCC has always used a value closer to what we have now. The real answer likely depends on many factors, such as the predictability of each of the branches or the relative capability of the branch predictor in the CPU.

When I try this patch on the llvm test suite plus 3 versions of SPEC, the geomean performance is 0.1% worse. The code size doesn't change (at -O3; it might be different at other opt levels). Those results were run very quickly though, so they might be noisier than we would like, and this patch in general will lead to changes that are often noisy due to the nature of branches being differently aligned. If we ran it a week earlier or a week later, the performance might be totally different.

@hiraditya
Collaborator Author

hiraditya commented Jul 11, 2024

Like @ilinpv suggested I believe we already use a lower value for -Oz and can extend this to -Os if that helps?

As I don't have a way to run SPEC at the moment, I'm fine if we make this change only for -Os. I'll create a separate issue for adding profile info to the codegen of switch statements.
