Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673

Merged
merged 32 commits into from
Jun 21, 2024

Conversation

ebepho
Copy link
Contributor

@ebepho ebepho commented Jun 18, 2024

Contributes to #99957

Stress test output without ReciprocalExponent:

Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalEstimate_float() : 16
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalEstimate_double() : 16
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtEstimate_float() : 16
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtEstimate_double() : 16
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtStep_float() : 7
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtStep_double() : 7
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalStep_float() : 7
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalStep_double() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalEstimate_float() : 16
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalEstimate_double() : 16
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtEstimate_float() : 16
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtEstimate_double() : 16
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtStep_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalSqrtStep_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalStep_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalStep_double() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

ReciprocalExponent Stress test output:

Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalExponent_float() : 16
Passed test: _Sve_r::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalExponent_double() : 16
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalExponent_float() : 16
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_ReciprocalExponent_double() : 16
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

cc @dotnet/arm64-contrib

Edit: issue resolved
Note: Currently Blocked on ReciprocalExponent for testing. ReciprocalExponent uses the FRECPX instruction. When running the tests, the values I'm generating using the functions in Helper.cs are way off from the expected results.

My current implementation in Helpers.cs is: public static double RecipExpo(double op1) => Math.Pow(op1, -1);

Here's some example error messages:

Sve.ReciprocalExponent<Single>(Vector<Single>): RunBasicScenario_UnsafeRead failed:
firstOp: (0.2654442, 0.6685686, 0.86405885, 0.2987031)
  result: (8, 4, 4, 8)
Sve.ReciprocalExponent<Single>(Vector<Single>): RunBasicScenario_Load failed:
firstOp: (0.5169994, 0.98310274, 0.14602184, 0.06981233)
  result: (4, 4, 16, 32)

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

1 similar comment
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

("VecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "ReciprocalStep_Vector128_Single", ["Isa"] = "AdvSimd", ["LoadIsa"] = "AdvSimd", ["Method"] = "ReciprocalStep", ["RetVectorType"] = "Vector128", ["RetBaseType"] = "Single", ["Op1VectorType"] = "Vector128", ["Op1BaseType"] = "Single", ["Op2VectorType"] = "Vector128", ["Op2BaseType"] = "Single", ["LargestVectorSize"] = "16", ["NextValueOp1"] = "TestLibrary.Generator.GetSingle()", ["NextValueOp2"] = "TestLibrary.Generator.GetSingle()", ["ValidateIterResult"] = "BitConverter.SingleToInt32Bits(Helpers.FPRecipStepFused(left[i], right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
("VecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "ReciprocalSquareRootStep_Vector64_Single", ["Isa"] = "AdvSimd", ["LoadIsa"] = "AdvSimd", ["Method"] = "ReciprocalSquareRootStep", ["RetVectorType"] = "Vector64", ["RetBaseType"] = "Single", ["Op1VectorType"] = "Vector64", ["Op1BaseType"] = "Single", ["Op2VectorType"] = "Vector64", ["Op2BaseType"] = "Single", ["LargestVectorSize"] = "8", ["NextValueOp1"] = "TestLibrary.Generator.GetSingle()", ["NextValueOp2"] = "TestLibrary.Generator.GetSingle()", ["ValidateIterResult"] = "BitConverter.SingleToInt32Bits(Helpers.FPReciprocalSqrtStepFused(left[i], right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
("VecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "ReciprocalSquareRootStep_Vector128_Single", ["Isa"] = "AdvSimd", ["LoadIsa"] = "AdvSimd", ["Method"] = "ReciprocalSquareRootStep", ["RetVectorType"] = "Vector128", ["RetBaseType"] = "Single", ["Op1VectorType"] = "Vector128", ["Op1BaseType"] = "Single", ["Op2VectorType"] = "Vector128", ["Op2BaseType"] = "Single", ["LargestVectorSize"] = "16", ["NextValueOp1"] = "TestLibrary.Generator.GetSingle()", ["NextValueOp2"] = "TestLibrary.Generator.GetSingle()", ["ValidateIterResult"] = "BitConverter.SingleToInt32Bits(Helpers.FPReciprocalSqrtStepFused(left[i], right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
("VecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "ReciprocalStep_Vector64_Single", ["Isa"] = "AdvSimd", ["LoadIsa"] = "AdvSimd", ["Method"] = "ReciprocalStep", ["RetVectorType"] = "Vector64", ["RetBaseType"] = "Single", ["Op1VectorType"] = "Vector64", ["Op1BaseType"] = "Single", ["Op2VectorType"] = "Vector64", ["Op2BaseType"] = "Single", ["LargestVectorSize"] = "8", ["NextValueOp1"] = "TestLibrary.Generator.GetSingle()", ["NextValueOp2"] = "TestLibrary.Generator.GetSingle()", ["ValidateIterResult"] = "BitConverter.SingleToInt32Bits(Helpers.FPReciprocalSqrtStepFused(left[i], right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kunalspathak kunalspathak self-requested a review June 20, 2024 04:43
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the AdvSimd test

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 tests are still failing.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak
Copy link
Member

/ba-g timeout issues

@kunalspathak kunalspathak merged commit 2b5c542 into dotnet:main Jun 21, 2024
155 of 167 checks passed
rzikm pushed a commit to rzikm/dotnet-runtime that referenced this pull request Jun 24, 2024
…eciprocalSqrtEstimate`, `ReciprocalSqrtStep`, and `ReciprocalStep` (dotnet#103673)

* Added ConverToInt32 and ConvertToUInt32 for float inputs.

* Added flags to handle only low predicate registers.

* Fix whitespace

* Remove special codegen flag

* Added new test template for operations with different return types.

* Add new test template.

* Added api for ConvertToInt32 and ConvertToUInt 32 for double.

* Finished 4/5 Round intrinsics for sve fp.

* Completed ReciprocalExponent SVE api.

* Revert "Finished 4/5 Round intrinsics for sve fp."

This reverts commit aab2067.

* Reciprocal SVE apis - note: ReciprocalExponent still has issues.

* Completed ReciprocalExponent SVE api.

* fixed naming.

* Fixed some naming.

* fixed more affect function names.

* Fixed some more naming and retested. All stress tests pass.

* fixed more helper function naming.
@github-actions github-actions bot locked and limited conversation to collaborators Jul 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants