Update/wrap to cuda-convnet2 #1044
Comments
Other info by @memimo on pylearn-dev: I'm not sure what you mean by requesting temp memory. But yes, it still uses the B01C order. I think this is still the best option for pylearn2 for the following reasons: its interface hasn't changed much, so we can update our wrapper with the least amount of effort.
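For readers unfamiliar with the layout question: Theano's own convolutions are batch-major while the cuda-convnet kernels are channel-major (the existing pylearn2 wrapper documents this as c01b: channels, rows, cols, batch), so any wrapper has to shuffle axes at the boundary. A minimal sketch of that conversion:

```python
import theano.tensor as T

# Theano's conv2d uses the bc01 layout: (batch, channels, rows, cols).
# The cuda-convnet kernels instead want c01b: (channels, rows, cols, batch).
images_bc01 = T.tensor4('images')

# Shuffle axes before handing the tensor to the cuda-convnet op ...
images_c01b = images_bc01.dimshuffle(1, 2, 3, 0)

# ... and shuffle the op's output back to bc01 for the rest of the graph.
def c01b_to_bc01(x):
    return x.dimshuffle(3, 0, 1, 2)
```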
This is probably worth keeping an eye on: https://github.com/soumith/convnet-benchmarks
Soumith just posted some results for a single convolutional layer (see README in his repo). Looks like this is definitely going to be worth the effort :)
If it's Apache-licensed then we cannot include it in pylearn2 directly. I think the Theano ops are the right layer at which to do this.
This Stack Exchange post seems to suggest that you should be fine including Apache-licensed code in a BSD-licensed project, provided that you also include the Apache license file with the module that was released with cuda-convnet2: http://programmers.stackexchange.com/questions/40561/is-bsd-license-compatible-with-apache. Here's the relevant bit of the Apache license: http://www.apache.org/licenses/LICENSE-2.0.html#redistribution. Don't know if the reasons that pylearn2 must stay BSD prohibit that format, though.
Take a look here
Hi, the Apache license has many more restrictions on users than BSD. I think we could look into making a separate repo, but add it in the setup.py. Anyway, in all cases, we first need someone to do the wrapper. Fred
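One way to read the separate-repo idea, purely as a sketch: keep the Apache-licensed code out of the BSD tree and have setup.py fetch it at install time. Everything below (the URL, the package name, the command class) is hypothetical, not anything pylearn2 ships:

```python
# Hypothetical sketch only: pylearn2's actual setup.py does not do this.
import os
import urllib  # Python 2, matching pylearn2's era
from setuptools import setup
from setuptools.command.install import install

CONVNET2_URL = 'https://example.com/cuda-convnet2.tar.gz'  # placeholder URL

class FetchConvnet2(install):
    """Install command that also downloads the separately hosted kernels."""
    def run(self):
        install.run(self)
        dest = os.path.join(self.install_lib, 'cuda_convnet2.tar.gz')
        # Fetched at install time, so the Apache-licensed code is never
        # redistributed inside the BSD-licensed source tree itself.
        urllib.urlretrieve(CONVNET2_URL, dest)

setup(name='pylearn2-convnet2-demo', cmdclass={'install': FetchConvnet2})
```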
Is there anyone actively working on this port? I'd be very interested in moving forward on this issue technically, even if there are licensing constraints that we'd have to consider later on when integrating with pylearn2. The support offered for multiple GPUs would be an excellent value add for pylearn2. This year's ILSVRC competition featured VGG's convnet, trained for ~4-6 weeks on 4 GPUs. On a single GPU that kind of computation would be infeasible, and it would be great to have pylearn2 help facilitate research at that scale. I understand that @goodfeli and @dwf were responsible for the original wrapper around cuda-convnet for pylearn2, and I would be curious to hear what your estimates would be for a port of Krizhevsky's cuda-convnet2 library. A cursory comparison of cuda-convnet2 makes it seem like the high-level interface to the library has stayed very similar, so I would anticipate a port being pretty feasible with a few weekends' worth of dedicated work. I'd also appreciate a quick assessment of whether or not Krizhevsky's hybrid data/model parallelism method would play well with Theano -- if not, pure data parallelism might provide most of the benefit with a smaller amount of effort. Even if multiple-GPU support requires a longer-term effort, the improved training times on Kepler GPUs would still be a nice value add.
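For concreteness, pure data parallelism is conceptually simple: split each minibatch across GPUs, compute gradients per shard, and average. Below is a minimal NumPy sketch of one synchronous step; `grad_fn`, the sequential device loop, and the learning rate are all illustrative assumptions, not an existing Theano API:

```python
import numpy as np

def data_parallel_step(params, minibatch, grad_fn, n_gpus=4):
    """One synchronous data-parallel SGD step (illustrative sketch).

    grad_fn(params, shard) is assumed to run the full forward/backward
    pass for one shard on one GPU and return the parameter gradients.
    """
    shards = np.array_split(minibatch, n_gpus)
    # In a real multi-GPU setup these calls would run concurrently,
    # one per device; here they are sequential for clarity.
    grads = [grad_fn(params, shard) for shard in shards]
    # Average the per-device gradients: each shard saw 1/n of the batch.
    mean_grads = [sum(g) / float(n_gpus) for g in zip(*grads)]
    learning_rate = 0.01
    return [p - learning_rate * g for p, g in zip(params, mean_grads)]
```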
As long as the interface is indeed similar, you're right that a port should be feasible.
cuda-convnet2 can't be put in Theano or Pylearn2 due to the license. Moving the cuda-convnet wrapper to Theano makes sense, but it will probably become outdated. For multi-GPU, we should talk about that in theano-dev. We have short-term plans there. Fred
Has cuDNN been compared against cuda-convnet2? I found it odd that the blog post about cuDNN made no mention of it. Soumith's benchmarks seem to indicate that cuda-convnet2 beats the Caffe gemm approach for a few configurations (https://github.com/soumith/convnet-benchmarks). Since cuDNN is supposedly only 1.2x - 1.3x faster than Caffe, it might still be beneficial to use cuda-convnet2 for certain configurations. It might not be worth the effort, though... perhaps it would be a good idea to wait on that decision until cuDNN support is implemented, so it can be included in the benchmarks. If cuda-convnet2 still turns out to have an edge for some input configurations, a more informed decision can be made.
My guess is that cuDNN will get updated until it always beats cuda-convnet2. I agree, it would be good to have it in the benchmark to know the current difference.
@goodfeli, thanks for the analysis. It seems like the general consensus is that any sort of integration should be addressed at the Theano level rather than the pylearn2 level, so I will gladly move that discussion to the theano-dev mailing list. And thanks for the correction with regard to Krizhevsky. @nouiz, it looks like Caffe's integration of cuDNN required many thousands of lines of code, so I'm not sure how short-term that project will be. I'd like to stay up to date on that progress, though. I was unable to find an open issue / PR about multi-GPU support on the Theano GitHub page -- if one does exist, do you think you could drop in a link to it? I'm of the opinion that it would still be worth pursuing the cuda-convnet2 integration in parallel, since as @benanne mentions it's unlikely that the difference in performance between the two will be too substantial.
There is no ticket about cuDNN; I just created one. Last Friday, @abergeron finished the first version of our wrapping of cuDNN.
Yeah, the estimate in Krizhevsky's paper was that ~90% of the speedup from multi-GPU support could be achieved by supporting data parallelism in the conv layers. Thanks for creating that ticket.
Bump :) Is this still being considered? Soumith's latest benchmarks (https://github.com/soumith/convnet-benchmarks) show that cuda-convnet2 is pretty competitive for some configurations, even compared to cuDNN R2. I am still using the cuda-convnet wrappers a lot, because even on the GTX 980, I can still get substantial speedups from them compared to all the other convolution implementations that are now available in Theano. So I imagine cuda-convnet2 would probably be even faster for my use cases. I'm willing to help with this if I can be of any use, but someone else would need to take the lead as I'm not comfortable at all with C/C++.
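For reference, this is roughly what the existing cuda-convnet (v1) wrapper in the pylearn2 sandbox looks like from user code; a cuda-convnet2 wrapper could plausibly keep the same shape of interface (sketch based on the documented FilterActs API):

```python
import theano.tensor as T
from theano.sandbox.cuda.basic_ops import gpu_contiguous
from pylearn2.sandbox.cuda_convnet.filter_acts import FilterActs

images_c01b = T.tensor4('images')    # (channels, rows, cols, batch)
filters_c01b = T.tensor4('filters')  # (channels, rows, cols, n_filters)

# FilterActs wants contiguous GPU tensors in the c01b layout.
conv_op = FilterActs(stride=1, partial_sum=1, pad=0)
output_c01b = conv_op(gpu_contiguous(images_c01b),
                      gpu_contiguous(filters_c01b))
```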
Time to upgrade pylearn2's wrap of cuda-convnet:
https://code.google.com/p/cuda-convnet2/
https://plus.google.com/u/0/+AlexKrizhevsky/posts/GeGh4j7kDcR
Need to check the license; it is Apache. We will also probably need to select the old or the new version depending on the user's GPU, as the new one doesn't handle older cards. Or at least, test whether the new one works and isn't slower on an older GTX 580.
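A minimal timing harness one could use for that check; the shapes are arbitrary and the baseline op is Theano's stock conv2d, the idea being to swap in the old or new wrapper and compare on the same GPU:

```python
import time
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d  # baseline; swap in the wrapped op

images = T.tensor4('images')
filters = T.tensor4('filters')
f = theano.function([images, filters], conv2d(images, filters))

# Arbitrary example shapes: batch 128, 96 input channels, 32x32 images,
# 128 filters of size 5x5.
x = np.random.randn(128, 96, 32, 32).astype('float32')
w = np.random.randn(128, 96, 5, 5).astype('float32')

f(x, w)  # warm-up call so compilation is not timed
start = time.time()
for _ in range(10):
    f(x, w)
print('mean time per call: %.4fs' % ((time.time() - start) / 10))
```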