Disable ARCH_OPT_FLAGS in mkl-dnn #618

Merged on Sep 12, 2018 (1 commit)


okapies
Contributor

@okapies okapies commented Sep 11, 2018

As of at least v0.16, MKL-DNN tries to enable -march=native -mtune=native when -DARCH_OPT_FLAGS is not specified (ref):

set(ARCH_OPT_FLAGS "HostOpts" CACHE STRING
    "specifies compiler optimization flags (see below for more information).
    If empty default optimization level would be applied which depends on the
    compiler being used.

    - For Intel(R) C++ Compilers the default option is `-xHOST` which instructs
      the compiler to generate the code for the architecture where building is
      happening. This option would not allow to run the library on older
      architectures.

    - For GNU* Compiler Collection version 5 and newer the default options are
      `-march=native -mtune=native` which behaves similarly to the descriprion
      above.

    - For all other cases there are no special optimizations flags.

    If the library is to be built for generic architecture (e.g. built by a
    Linux distributive maintainer) one may want to specify ARCH_OPT_FLAGS=\"\"
    to not use any host specific instructions")
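The default described above can be overridden at configure time. What follows is only a minimal sketch of such an invocation, not the exact change made in this PR: MKLDNN_SRC and BUILD_DIR are hypothetical paths, and the script skips the configure step when no checkout or no cmake is present.

```shell
#!/bin/sh
# Sketch: configure MKL-DNN without host-specific optimization flags.
# MKLDNN_SRC and BUILD_DIR are hypothetical locations (assumptions).
MKLDNN_SRC="${MKLDNN_SRC:-./mkl-dnn}"
BUILD_DIR="${BUILD_DIR:-./mkl-dnn-build}"

# An empty ARCH_OPT_FLAGS asks the build not to add -march=native -mtune=native.
CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release -DARCH_OPT_FLAGS="

if [ -f "$MKLDNN_SRC/CMakeLists.txt" ] && command -v cmake >/dev/null 2>&1; then
    SRC_ABS=$(cd "$MKLDNN_SRC" && pwd)
    mkdir -p "$BUILD_DIR"
    # $CMAKE_ARGS is intentionally unquoted so it splits into two arguments.
    (cd "$BUILD_DIR" && cmake $CMAKE_ARGS "$SRC_ABS")
else
    echo "skipping configure: no MKL-DNN checkout at $MKLDNN_SRC, or cmake is not installed"
fi
```

A distributor could pass a fixed baseline instead of the empty string, for example a specific -march value, if a minimum CPU generation is agreed on.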

This can fail when you build and distribute the mkl-dnn binary. For example, GCC emits newer CPU instructions supported by the detected processor even if as (the assembler) from an older binutils on the build machine doesn't support them (ref):

# make
[  1%] Building CXX object src/CMakeFiles/mkldnn.dir/common/batch_normalization.cpp.o
/tmp/ccs0bO1c.s: Assembler messages:
/tmp/ccs0bO1c.s:211: Error: suffix or operands invalid for `vpsrlq'
/tmp/ccs0bO1c.s:212: Error: suffix or operands invalid for `vpmovsxdq'
/tmp/ccs0bO1c.s:213: Error: no such instruction: `vextracti128 $0x1,%ymm2,%xmm2'
/tmp/ccs0bO1c.s:214: Error: suffix or operands invalid for `vpmuludq'
/tmp/ccs0bO1c.s:215: Error: suffix or operands invalid for `vpsrlq'
/tmp/ccs0bO1c.s:216: Error: suffix or operands invalid for `vpmuludq'
/tmp/ccs0bO1c.s:217: Error: suffix or operands invalid for `vpmovsxdq'
...
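Whether a given toolchain is affected can be checked before building. The following is a rough sketch, assuming an x86 host with an as that reads assembly from standard input (as GNU as does); as_supports is a hypothetical helper name, not part of binutils or of this PR.

```shell
#!/bin/sh
# Sketch: ask the installed assembler whether it accepts one instruction.
# as_supports is a hypothetical helper (assumption, not a binutils tool).
as_supports() {
    # $1: a single line of AT&T-syntax assembly
    command -v as >/dev/null 2>&1 || return 2
    obj=$(mktemp) || return 2
    # With no input file named, the assembler reads the source from stdin.
    printf '%s\n' "$1" | as -o "$obj" >/dev/null 2>&1
    status=$?
    rm -f "$obj"
    return "$status"
}

# vextracti128 is one of the instructions rejected in the log above.
if as_supports 'vextracti128 $0x1,%ymm2,%xmm2'; then
    echo "assembler accepts vextracti128"
else
    echo "assembler rejects vextracti128 (or is not an x86 assembler)"
fi
```

If the probe fails on a machine whose CPU supports AVX2, a build with -march=native is likely to hit the errors shown in the log.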

This is especially a problem when building the mkl-dnn binary on virtualized machines, because we can't control which processor the build runs on.
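To see what -march=native resolves to on a particular build machine, GCC can be asked to print its target settings. This is a small sketch, assuming a GCC that supports -Q --help=target; native_march is a hypothetical helper name, and it prints "unknown" when gcc is missing or does not report a value.

```shell
#!/bin/sh
# Sketch: print the architecture GCC selects for -march=native on this host.
# native_march is a hypothetical helper name (assumption).
native_march() {
    command -v gcc >/dev/null 2>&1 || { echo unknown; return; }
    # -Q --help=target lists the target options as GCC would set them.
    gcc -march=native -Q --help=target 2>/dev/null |
        awk '$1 == "-march=" { print $2; found = 1; exit }
             END { if (!found) print "unknown" }'
}

echo "GCC resolves -march=native to: $(native_march)"
```

Running this on two differently provisioned virtual machines makes it visible when the same build script would produce binaries for different instruction sets.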

@saudet
Member

saudet commented Sep 11, 2018 via email

@okapies
Contributor Author

okapies commented Sep 11, 2018

Do you mean something like -march=sandybridge, to support Sandy Bridge and later?

https://wiki.gentoo.org/wiki/Safe_CFLAGS

That seems good, but I have no idea which option is preferable, or what the minimum requirement for mkl-dnn is. We could ask the MKL-DNN community which option is a good default.

@saudet
Member

saudet commented Sep 12, 2018

Something like -msse4.1 -msse4.2 -mavx is probably safer for AMD CPUs, but yeah I don't know if it really matters and might be best to leave it to the defaults for now...

@okapies
Contributor Author

okapies commented Sep 12, 2018

I posted an issue to the mkl-dnn repository: oneapi-src/oneDNN#321

@emfomenk

Hi, MKL-DNN dev here.

Just want to note that Intel MKL-DNN can be compiled with whatever level of optimization is suitable for a user.

For most of the primitives, Intel MKL-DNN generates its own code at runtime for the most appropriate instruction set, based on the hardware on which you run the library, so strictly speaking -march=native is not that important. However, there are a few cases where we rely on compiler optimizations. That's why we use -march=native by default.

@saudet saudet merged commit dee6581 into bytedeco:master Sep 12, 2018
@okapies okapies deleted the feature/disable-mkldnn-arch-opt branch September 12, 2018 03:07