Allow dcurl to compile and execute AVX code implementation on the hardware not supporting AVX2 #83

marktwtn · 2018-11-13T09:26:20Z

The BUILD_AVX=1 build option enables the AVX implementation of dcurl.
However, the current AVX code has two different implementations:

1. Supports AVX instructions only.
   Not the default. The Makefile and the source code must be modified to use it.
2. Supports AVX and AVX2 instructions.
   The default.

This issue is about automatically selecting the correct AVX implementation based on the instructions the hardware supports.
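
One possible way to make that selection automatic is at compile time, using the macros GCC defines for the enabled instruction sets. This is a minimal sketch, assuming the build passes a flag such as -march=native so that __AVX2__ or __AVX__ reflects the build machine. The macro POW_IMPL_NAME and the function pow_impl_name are hypothetical names for illustration, and this is not necessarily how dcurl resolved the issue.

```c
#include <string.h>

/* Hypothetical names for illustration; dcurl's real symbols may differ. */
#if defined(__AVX2__)
/* The compiler was allowed to emit AVX2, so the AVX + AVX2 variant is usable. */
#define POW_IMPL_NAME "avx2"
#elif defined(__AVX__)
/* Only AVX is enabled, so fall back to the AVX-only variant. */
#define POW_IMPL_NAME "avx"
#else
/* Neither instruction set is enabled for this build. */
#define POW_IMPL_NAME "generic"
#endif

/* Report which variant this translation unit was compiled for. */
static const char *pow_impl_name(void)
{
    return POW_IMPL_NAME;
}
```

In a real build the same #if chain would guard the two code paths instead of a string, so no manual Makefile or source edits are needed to switch between them.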


jserv · 2018-11-17T09:57:07Z

AVX (not AVX2) support is crucial for AMD Ryzen, since its AVX2 performance is known to be slower than that of the Intel Core i9 series.

marktwtn · 2018-11-18T17:40:48Z

With the -Ofast optimization level, the AVX version never finishes executing.
However, this does not happen with the AVX2 version.
If we change the optimization level to -O3, the problem disappears.

The GCC version has nothing to do with the problem.

The difference in the generated assembly code:

-O3

        vucomisd        %xmm3, %xmm0
        jp      .L114
        jne     .L114

-Ofast

        vcomisd %xmm3, %xmm0
        jne     .L114

This code is generated when a __m256d variable is compared with a constant value or when its value is used in a logical operation.

Example:

__m256d nonce_probe = ...
...
nonce_probe[0] == LBITS

__m256d carry;
...
i == INCR_START || carry[0]

The jp instruction jumps if either operand of the preceding comparison is NaN.
NaN values are defined in the IEEE 754 floating-point standard.

The bitwise operations in the PoW might create NaN values.
However, NaN is not handled at the -Ofast optimization level, since the compiler optimizes out the jp instruction.
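
To illustrate the point: a bit pattern produced by bitwise operations can be a NaN encoding when viewed as a double, and a floating-point comparison on it then takes the "unordered" path. A minimal sketch, assuming 64-bit IEEE 754 doubles, of comparing by raw bit pattern instead, so the result does not depend on how the compiler treats NaN. The helper names are hypothetical and not taken from dcurl.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; they are not part of dcurl. */

/* View a 64-bit pattern as a double without changing any bits. */
static double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* View a double as its raw 64-bit pattern. */
static uint64_t double_to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Compare two doubles by bit pattern; a NaN encoding equals itself here. */
static int same_bits(double a, double b)
{
    return double_to_bits(a) == double_to_bits(b);
}
```

Under strict IEEE 754 semantics, x == x is false when x holds a NaN encoding such as all bits set, while same_bits(x, x) is true because it is an integer comparison.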

jserv · 2018-11-19T13:17:43Z

Let's stick to the -O3 optimization level and document the explanation for future tracking.
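
For tracking purposes, it may help to know why -Ofast behaves differently: in GCC, -Ofast turns on -ffast-math, which includes -ffinite-math-only, and the compiler then defines __FINITE_MATH_ONLY__ as 1. A small sketch of how code could detect such a build. The function name finite_math_only_enabled is hypothetical and not part of dcurl.

```c
/* Hypothetical helper for illustration; it is not part of dcurl. */

/* Return 1 if this file was compiled with finite-math-only assumptions
 * (for example via -Ofast or -ffast-math), where NaN checks such as the
 * jp branch may be optimized out; return 0 otherwise. */
static int finite_math_only_enabled(void)
{
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return 1;
#else
    return 0;
#endif
}
```

A build that must keep NaN handling intact could turn the same preprocessor condition into an #error so that an accidental -Ofast fails at compile time.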

Close DLTcollab#83.

marktwtn self-assigned this Nov 13, 2018

marktwtn added a commit to marktwtn/dcurl that referenced this issue Nov 20, 2018

Support AVX implementation on the non-AVX2 hardware

796f4f2

Close DLTcollab#83.

marktwtn mentioned this issue Nov 20, 2018

Support AVX implementation on the non-AVX2 hardware #85

Merged

jserv closed this as completed in #85 Nov 20, 2018
