Use cmeq, cmge, cmgt (zero) when one of the operands is Vector64/128<T>.Zero #33972

echesakov · 2020-03-23T17:43:28Z

For example, the code for BitArray:.ctor shown in #33749 (comment)

dup     v17.16b, wzr
cmeq    v16.16b, v16.16b, v17.16b

should be optimized down to

cmeq    v16.16b, v16.16b, #0

This applies to all the intrinsics that are mapped to the cmeq, cmge, cmgt, cmle, cmlt, fcmeq, fcmge, fcmgt, fcmle, and fcmlt instructions.
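As an illustration, here is a minimal C# sketch of the source-level pattern in question. The class and method names (`ZeroCompareSketch`, `IsZeroMask`) are hypothetical and not taken from the runtime; the scalar fallback is only there so the sketch also runs on hardware without AdvSimd.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;

public static class ZeroCompareSketch
{
    // Hypothetical helper: returns 0xFF in every lane whose input byte is zero,
    // and 0x00 in every other lane.
    public static Vector128<byte> IsZeroMask(Vector128<byte> value)
    {
        if (AdvSimd.IsSupported)
        {
            // One operand is Vector128<byte>.Zero. This is the shape the issue
            // asks the JIT to emit as "cmeq vD.16b, vN.16b, #0" instead of
            // materializing the zero vector with "dup" first.
            return AdvSimd.CompareEqual(value, Vector128<byte>.Zero);
        }

        // Scalar fallback (an assumption of this sketch, not part of the issue)
        // producing the same lane-wise result on other hardware.
        Vector128<byte> result = Vector128<byte>.Zero;
        for (int i = 0; i < Vector128<byte>.Count; i++)
        {
            if (value.GetElement(i) == 0)
            {
                result = result.WithElement(i, (byte)0xFF);
            }
        }
        return result;
    }
}
```

Whether the immediate-zero form is actually emitted depends on the JIT version; the managed result is the same either way.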

category:cq
theme:hardware-intrinsics
skill-level:intermediate
cost:small

Gnbrkm41 · 2020-03-31T06:49:43Z

runtime/src/libraries/System.Collections/src/System/Collections/BitArray.cs

Lines 183 to 195 in c74407f

    
           // JIT does not support code hoisting for SIMD yet
           // However comparison against zero can be replaced to cmeq against zero (vceqzq_s8)
           // See dotnet/runtime#33972 for details
           Vector128<byte> zero = Vector128<byte>.Zero;
           fixed (bool* ptr = values)
           {
               for (; (i + Vector128ByteCount * 2u) <= (uint)values.Length; i += Vector128ByteCount * 2u)
               {
                   // Same logic as SSE2 path, however we lack MoveMask (equivalent) instruction
                   // As a workaround, mask out the relevant bit after comparison
                   // and combine by ORing all of them together (In this case, adding all of them does the same thing)
                   Vector128<byte> lowerVector = AdvSimd.LoadVector128((byte*)ptr + i);
                   Vector128<byte> lowerIsFalse = AdvSimd.CompareEqual(lowerVector, zero);
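The MoveMask workaround described in the comments above can be modeled in scalar C#. This is only a sketch of the idea, not the code from BitArray.cs: the names `MoveMaskSketch` and `MoveMaskEmulated` are hypothetical, and the 16 lanes are passed as a span rather than as a vector so that it runs on any hardware.

```csharp
using System;

public static class MoveMaskSketch
{
    // Hypothetical helper: takes 16 comparison lanes (each 0xFF or 0x00) and
    // packs them into a 16-bit mask, one bit per lane.
    public static ushort MoveMaskEmulated(ReadOnlySpan<byte> compareResult)
    {
        if (compareResult.Length != 16)
        {
            throw new ArgumentException("Expected exactly 16 lanes.", nameof(compareResult));
        }

        int mask = 0;
        for (int half = 0; half < 2; half++)
        {
            int sum = 0;
            for (int i = 0; i < 8; i++)
            {
                // Mask out the one bit that belongs to this lane.
                int masked = compareResult[half * 8 + i] & (1 << i);

                // Every lane contributes a distinct bit, so adding gives the
                // same result as ORing.
                sum += masked;
            }
            mask |= sum << (8 * half);
        }
        return (ushort)mask;
    }
}
```

Because each lane keeps a different bit after masking, no carries can occur, which is why addition can stand in for OR.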

echesakov · 2022-03-15T21:48:08Z

I believe this item was addressed fully.
@TIHan can you please confirm and close the issue?

TIHan · 2022-03-15T22:00:17Z

Yes, I believe it was. There is more opportunity with other instructions like 'cmle' and 'cmlt', but based on the title of this issue, we got them covered.

echesakov added the arch-arm64, area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), and optimization labels Mar 23, 2020

Dotnet-GitSync-Bot added the untriaged (New issue has not been triaged by the area owner) label Mar 23, 2020

echesakov mentioned this issue Mar 23, 2020

Vectorise BitArray for ARM64 #33749

Merged

BruceForstall added this to the Future milestone Apr 4, 2020

BruceForstall removed the untriaged (New issue has not been triaged by the area owner) label Apr 4, 2020

echesakov mentioned this issue Aug 13, 2020

Get index of first non ascii byte #39506

Merged

CarolEidt modified the milestones: Future, 6.0.0 Oct 13, 2020

echesakov mentioned this issue Oct 20, 2020

[Arm64] Planned JIT work in .NET 6 #43629

Closed

29 tasks

JulieLeeMSFT assigned echesakov Mar 23, 2021

JulieLeeMSFT added the needs-further-triage (Issue has been initially triaged, but needs deeper consideration or reconsideration) label Mar 23, 2021

JulieLeeMSFT removed the needs-further-triage (Issue has been initially triaged, but needs deeper consideration or reconsideration) label Jun 7, 2021

echesakov modified the milestones: 6.0.0, Future Jul 7, 2021

echesakov added the Priority:3 (Work that is nice to have) label Jul 7, 2021

echesakov modified the milestones: Future, 7.0.0 Oct 15, 2021

AndyAyersMS assigned TIHan Dec 1, 2021

TIHan mentioned this issue Dec 16, 2021

'cmeq' and 'fcmeq' Vector64<T>.Zero/Vector128<T>.Zero ARM64 containment optimizations #62933

Merged

3 tasks

TIHan mentioned this issue Feb 4, 2022

[JIT] More ARM64 comparison instruction optimizations with Vector.Zero #64783

Merged

1 task

echesakov removed their assignment Mar 15, 2022

TIHan closed this as completed Mar 15, 2022

ghost locked as resolved and limited conversation to collaborators Apr 15, 2022
