Problems with NVIDIA Benchmarks #98

Open
yl-jiang opened this issue May 16, 2018 · 3 comments

Comments

@yl-jiang

Environment:

  1. GPU cards: Tesla K80
  2. CUDA: 8.0
  3. cuDNN: 5.1
  4. OpenMPI: 1.10.2

Problems:

After running make, there are five binaries in .../nvidia/bin:

conv_bench gemm_bench nccl_mpi_all_reduce nccl_single_all_reduce rnn_bench

'rnn_bench' and 'nccl_single_all_reduce' run successfully, but the other three fail:

  1. 'gemm_bench' fails with "terminate called after throwing an instance of 'std::runtime_error'".
  2. 'conv_bench' stops during the 11th test with "terminate called after throwing an instance of 'std::runtime_error' what(): Illegal algorithm passed to get_fwd_algo_string. Algo: 7".
  3. 'nccl_mpi_all_reduce' fails with "terminate called after throwing an instance of 'std::runtime_error' what(): NCCL failure: invalid device pointer in nccl_mpi_all_reduce.cu at line: 86 rank: 0".

How can I fix these?

@sharannarang
Contributor

I haven't really tested DeepBench kernels for K80. Are you sure you compiled with the correct SM version? Are the drivers updated to run with CUDA 8.0?
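
For reference, the K80 is a Kepler part with compute capability 3.7, so the matching nvcc target would be sm_37. Below is a minimal sketch of how the -gencode flag follows from a compute capability. The helper name gencode_flag is hypothetical and not part of DeepBench; on a real system the major/minor values would typically be read from cudaGetDeviceProperties().

```cpp
#include <string>

// Hypothetical helper (not part of DeepBench): build the nvcc -gencode flag
// for a given compute capability, e.g. major = 3, minor = 7 for a Tesla K80.
std::string gencode_flag(int major, int minor) {
    const std::string cc = std::to_string(major) + std::to_string(minor);
    return "-gencode arch=compute_" + cc + ",code=sm_" + cc;
}
```

Calling gencode_flag(3, 7) yields the flag for a K80. How that flag is passed to the DeepBench build depends on its Makefile, which this sketch does not assume anything about.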

@jfurtek

jfurtek commented May 25, 2018

  1. As currently written, gemm_bench will fail on Kepler GPUs with CUDA 8 and later. cublasGemmEx() is only supported on GPUs with SM 5.0 or greater (i.e. Maxwell and newer).
     https://docs.nvidia.com/cuda/cublas/index.html#cublas-GemmEx

  2. Algo 7 is CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED, and DeepBench has a case statement for that in get_fwd_algo_string() when CUDNN_MAJOR >= 6. Maybe a pre-cuDNNv6 header file was in your include path?
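
The two points above can be sketched roughly as follows. This is an illustration only, not DeepBench's actual code: supports_gemm_ex and fwd_algo_name are hypothetical helpers, the SM 5.0 threshold is taken from the statement above, and the integer values are assumed to mirror the cudnnConvolutionFwdAlgo_t enum (check them against the cudnn.h on your include path). Unlike a lookup that throws on an unknown value, this version falls back to a placeholder string, so a benchmark run would not abort on an algorithm that the string table does not know about.

```cpp
#include <string>

// Hypothetical guard: cublasGemmEx() is stated above to require SM 5.0+,
// so a Kepler device such as the K80 (major version 3) would be rejected.
bool supports_gemm_ex(int sm_major) {
    return sm_major >= 5;
}

// Hypothetical lookup from a forward-convolution algorithm number to a name.
// The numbers are assumed to match the cudnnConvolutionFwdAlgo_t enum order.
std::string fwd_algo_name(int algo) {
    switch (algo) {
        case 0: return "IMPLICIT_GEMM";
        case 1: return "IMPLICIT_PRECOMP_GEMM";
        case 2: return "GEMM";
        case 3: return "DIRECT";
        case 4: return "FFT";
        case 5: return "FFT_TILING";
        case 6: return "WINOGRAD";
        case 7: return "WINOGRAD_NONFUSED";
        // Fall back instead of throwing, so an unrecognized value is reported
        // by number rather than terminating the run.
        default: return "UNKNOWN(" + std::to_string(algo) + ")";
    }
}
```

With these helpers, a caller could skip the GemmEx path when supports_gemm_ex() returns false and still print a readable label for any algorithm number that cuDNN reports.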

@yl-jiang
Author

I have changed the CUDA version to 7.5 and the cuDNN version to 5.0, and DeepBench can now run all of the benchmarks except 'nccl_mpi_all_reduce'.
