Experiment - Remove space restriction from XarchIntrinsicConstants #102913

khushal1996 · 2024-05-31T10:43:42Z

NO NEED FOR REVIEW

…cHelper.cs

dotnet-policy-service · 2024-05-31T10:44:08Z

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

MichalStrehovsky · 2024-05-31T11:15:12Z

src/coreclr/nativeaot/Runtime/startup.cpp

@@ -50,11 +50,11 @@ extern RhConfig * g_pRhConfig;

 #if defined(HOST_X86) || defined(HOST_AMD64) || defined(HOST_ARM64)
 // This field is inspected from the generated code to determine what intrinsics are available.
-EXTERN_C int g_cpuFeatures;


These are bit-tested at runtime using compiler generated code. This is going to be quite complicated to do without regressing perf on 32bit x86.

I already wrote this elsewhere - I would like to get an understanding why we're wasting bits in this enum - a lot of the bits are redundant, e.g. XArchIntrinsicConstants_Avx2 and XArchIntrinsicConstants_VectorT256. Why do we need this duplication? Is this putting RyuJIT implementation details ("VectorT256" that is not an ISA extension as far as I know) into PAL?

a lot of the bits are redundant, e.g. XArchIntrinsicConstants_Avx2 and XArchIntrinsicConstants_VectorT256

That's 3 bits (VectorT128, VectorT256, VectorT512). I would not call it a lot. Is there anything else?

Why do we need this duplication? Is this putting RyuJIT implementation details ("VectorT256" that is not an ISA extension as far as I know) into PAL?

Right, we do not need the VectorT* bits in the minipal. They got there by sequence of copy&paste operations of code that was in CoreCLR JIT/EE interface originally.

We do need the VectorT* bits on JIT/EE interface (that is different enum). The vector size is modelled as virtual ISA there.

That's 3 bits (VectorT128, VectorT256, VectorT512). I would not call it a lot. Is there anything else?

All of the VL suffixes that seem to just be shortcuts for XArchIntrinsicConstants_Avx512f_vl plus something else that already has a bit.

The question is basically whether we can get away with 32 bits until we're ready to call Sse41 and friends "baseline". The underlying oses are starting to do that on both Windows and Linux side. Or we'll need to do the more complicated thing and extend this to arbitrary lengths like this PR is doing.

a lot of the bits are redundant, e.g. XArchIntrinsicConstants_Avx2 and XArchIntrinsicConstants_VectorT256.

These are not duplicate. The former describes support for Avx2 the latter describes the size of Vector<T>, these do not have to line up and it may be important to understand the encoded size of Vector<T> for ABI purposes.

There are some opportunities to fold together bits. For example, AVX512*_VL could just be Avx512vl. Given that we require AVX512F+BW+CD+DQ+VL to be provided simultaneously, we could combine all of these into one as well.

We could notably also not allow as fine-grained control as this currently is allowing. Rather than allowing individual opt-in to SSE3, SSSE3, SSE4.1, etc, we could force these to be present at the respective target baselines (i.e. x86-64-v2, x86-64-v3, x86-64-v4) and only include the standalone ISAs that are independent of the primary baselines

However, all of that is less control than most other compilers toolchains give and we're likely to run out of bits sooner than we're able to meaningfully compress them and drop support for downlevel ISAs (although I am definitively still supportive of us moving towards an x86-64-v2 baseline). So it's probably goodness that we take a look at this since we're already pushing up against the boundaries already.

These are not duplicate.

They are duplicate/undesirable at PAL level. PAL should be policy free. XArchIntrinsicConstants_... is PAL level enum.

They are not duplicate at the JIT/EE interface level where the Vector policy size policy lives.

All of the VL suffixes that seem to just be shortcuts for XArchIntrinsicConstants_Avx512f_vl plus something else that already has a bit.

I agree that we can use just a single bit for VL in this enum if we needed to free up more bits.

khushal1996 · 2024-06-04T18:19:52Z

@tannergooding @jkotas @MichalStrehovsky
This experiment PR was an effort towards making space for the new ISA added in PR #101938 which was urgent for us at Intel. For now, we did make space by removing unused isa Avx10v1_V256. But I was trying to make more space for ISAs. Looks like VectorT* bits in minipal were removed and we are discussing getting rid of the VL bits as well. I will keep this open for discussion and close it once completed.

khushal1996 · 2024-06-10T04:57:29Z

Closing this discussion.

khushal1996 added 2 commits May 31, 2024 02:22

Remove space restriction from XarchIntrinsicConstatns

cdd0bd4

make structure of HardwareIntrinsicConstants same in HardwareIntrinsi…

1072c31

…cHelper.cs

dotnet-issue-labeler bot added the area-NativeAOT-coreclr label May 31, 2024

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 31, 2024

MichalStrehovsky reviewed May 31, 2024

View reviewed changes

build-analysis bot mentioned this pull request May 31, 2024

Dead lettering tests #101524

Closed

jkotas mentioned this pull request May 31, 2024

Delete VectorT size constants from minipal/cpufeatures #102946

Merged

khushal1996 closed this Jun 10, 2024

github-actions bot locked and limited conversation to collaborators Jul 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experiment - Remove space restriction from XarchIntrinsicConstants #102913

Experiment - Remove space restriction from XarchIntrinsicConstants #102913

khushal1996 commented May 31, 2024

dotnet-policy-service bot commented May 31, 2024

MichalStrehovsky May 31, 2024

jkotas May 31, 2024

MichalStrehovsky May 31, 2024

tannergooding May 31, 2024 •

edited

Loading

jkotas May 31, 2024

jkotas May 31, 2024

jkotas Jun 1, 2024

khushal1996 commented Jun 4, 2024

khushal1996 commented Jun 10, 2024

Experiment - Remove space restriction from XarchIntrinsicConstants #102913

Experiment - Remove space restriction from XarchIntrinsicConstants #102913

Conversation

khushal1996 commented May 31, 2024

NO NEED FOR REVIEW

dotnet-policy-service bot commented May 31, 2024

MichalStrehovsky May 31, 2024

Choose a reason for hiding this comment

jkotas May 31, 2024

Choose a reason for hiding this comment

MichalStrehovsky May 31, 2024

Choose a reason for hiding this comment

tannergooding May 31, 2024 • edited Loading

Choose a reason for hiding this comment

jkotas May 31, 2024

Choose a reason for hiding this comment

jkotas May 31, 2024

Choose a reason for hiding this comment

jkotas Jun 1, 2024

Choose a reason for hiding this comment

khushal1996 commented Jun 4, 2024

khushal1996 commented Jun 10, 2024

tannergooding May 31, 2024 •

edited

Loading