
cuDNN acceleration #1046

Merged
merged 9 commits into from Sep 8, 2014

Conversation

shelhamer
Member

Caffe + cuDNN is an NVIDIA-Caffe collaboration for deep learning. cuDNN is an acceleration library for deep network operations with drop-in integration into Caffe. It is a free download with CUDA developer registration and requires CUDA >= 6.5. Benchmarked on the AlexNet / CaffeNet architectures, this combination is the fastest public framework for deep learning in vision, with overall model speedups of 1.2-1.5x and layer-wise speedups of 1.2-3x over standard Caffe. Caffe + cuDNN lets you define your models just as before while taking advantage of these computational speedups.

In this first release, cuDNN includes:

  • convolution
  • pooling
  • nonlinearities (ReLU, Sigmoid, TanH)
  • softmax

These operations are drop-in accelerations of the Caffe layers. To switch on acceleration, set

USE_CUDNN := 1

in your Makefile.config during installation. Layers will be accelerated by default.
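
For reference, a typical rebuild after enabling the flag looks like the following (a sketch assuming the standard Makefile-based build; adjust -j to your machine):

make clean
make all -j8
make test -j8
make runtest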

NVIDIA and Caffe will coordinate future releases to further accelerate computation and introduce new features. NVIDIA has committed to tuning cuDNN to current and future GPU architectures.

Caffe is free and open-source, and cuDNN is a CUDA developer library like cuBLAS and cuRAND.

Check out the cuDNN site, Caffe's latest roast slides, and the NVIDIA Parallel Forall blog announcement!

Thanks to the cuDNN team for this collaboration and special thanks to Cliff Woolley for his attention to detail.


Note on convolution: the cuDNN convolution aims to match or exceed the speed of Caffe's own matrix-multiplication approach while reducing memory usage. In many input and model regimes it accelerates computation 1.3-3x and never requires extra buffers. In certain cases, such as fully-convolutional models or large inputs, the Caffe convolution is slightly faster at the cost of more memory usage; this is a direction for further optimization.

To pick the computational engine per layer in your models, set the engine: CAFFE or engine: CUDNN field in the {convolution,pooling,relu,sigmoid,tanh,softmax}_param of your model definition:

layers {
  type: CONVOLUTION
  ...
  convolution_param {
    engine: CAFFE
    ...
  }
}
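
A layer can likewise opt in to cuDNN explicitly. A minimal sketch using the pooling layer (the elided fields are placeholders):

layers {
  type: POOLING
  ...
  pooling_param {
    engine: CUDNN
    ...
  }
}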

@sguada
Contributor

sguada commented Sep 7, 2014

@shelhamer the slides are not public

@shelhamer
Member Author

Thank you for pointing that out. The slides are now public.


@niuzhiheng
Contributor

Awesome!
The NVIDIA blog post for this is here: http://devblogs.nvidia.com/parallelforall/accelerate-machine-learning-cudnn-deep-neural-network-library/

@OpenHero

OpenHero commented Sep 8, 2014

Only a library was published; there is no source code.


shelhamer added a commit that referenced this pull request Sep 8, 2014
@shelhamer shelhamer merged commit 3bafe2f into BVLC:dev Sep 8, 2014
@bhack bhack mentioned this pull request Sep 8, 2014
@Yangqing
Member

Just for the record: when Caffe is compiled with cuDNN and no changes are made to the pre-cuDNN protobufs, the default behavior is to use the cuDNN implementations.
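
A minimal sketch of that default, assuming a build with USE_CUDNN := 1 (the elided fields are placeholders):

layers {
  type: RELU
  ...
  # engine left unset: the cuDNN ReLU implementation is used by default
}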

This was referenced Sep 18, 2014
@shelhamer shelhamer deleted the cudnn branch September 19, 2014 04:36
@qianghuang84

cool component

@sguada
Contributor

sguada commented Sep 27, 2014

@shelhamer it is a bit annoying to get so many warnings when falling back to standard Caffe. It could log this once during Setup and then not repeat it.

W0926 23:53:11.389243 18332 cudnn_pooling_layer.cu:17] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.390046 18332 cudnn_pooling_layer.cu:17] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.546059 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.546967 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.
W0926 23:53:11.547852 18332 cudnn_pooling_layer.cu:36] Falling back to standard Caffe for padded pooling.

See #1170
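
A minimal sketch of the suggested fix in the layer code, assuming glog's LOG_FIRST_N macro (Caffe already logs through glog); the function name and call site here are illustrative, not the actual patch:

#include <glog/logging.h>

// Warn about the fallback only the first time it occurs,
// instead of on every forward/backward pass.
void WarnPaddedPoolingFallbackOnce() {
  LOG_FIRST_N(WARNING, 1)
      << "Falling back to standard Caffe for padded pooling.";
}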

mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014