When moving the model to the GPU with module.to(...), the program hangs for an extremely long time (almost half an hour), even though the model is tiny and can be created on the GPU nearly instantly in Python.
This is the code snippet where the freeze occurs. If I wait long enough, execution eventually continues fine, so I think CUDA itself is OK.
import org.bytedeco.pytorch.*;
import org.bytedeco.pytorch.global.torch;

Module model = ...
DeviceGuard g = new DeviceGuard(new Device(torch.kCUDA()));
model.to(g.current_device()); // hangs here for a long time
Any help is appreciated!
The binaries are compiled for only a single GPU architecture, so the delay is most likely CUDA JIT-compiling PTX code for your GPU architecture. If you set the CUDA compute cache large enough, to something like 256 MB, it should only have to do that once. You could also build from source for your architecture, but we cannot easily build the binaries for more architectures on GitHub Actions, since the build would take longer than the hard limit of 6 hours, and I don't have the resources to maintain any additional infrastructure.
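For reference, a minimal sketch of checking that setting from Java, assuming the "compute cache" above is the CUDA JIT compilation cache controlled by the CUDA_CACHE_MAXSIZE environment variable, which has to be set before the process that loads CUDA starts:

// Assumption: CUDA_CACHE_MAXSIZE was exported before launching the JVM,
// e.g. CUDA_CACHE_MAXSIZE=268435456 java -jar app.jar  (268435456 bytes = 256 MB)
String max = System.getenv("CUDA_CACHE_MAXSIZE");
System.out.println("CUDA JIT cache limit: " + (max != null ? max + " bytes" : "driver default"));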
Another thing you could do is use the binaries from LibTorch itself. Extract them somewhere on your system, make sure their libraries can be found on your system PATH, and set the "org.bytedeco.javacpp.pathsFirst" system property to "true" before loading anything with JavaCPP.
Incidentally, that is also one way that you could make JavaCPP load the same libraries as DJL.
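For completeness, a minimal sketch of the property part of those steps, assuming the LibTorch libraries have already been extracted and put on the PATH; the property can also be passed as -Dorg.bytedeco.javacpp.pathsFirst=true on the command line, and in either case it has to take effect before any JavaCPP-backed class is loaded:

// Prefer libraries found on the system paths (e.g. the LibTorch ones on PATH)
// over the bundled ones; must run before any org.bytedeco.pytorch.* class loads.
System.setProperty("org.bytedeco.javacpp.pathsFirst", "true");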
/cc @frankfliu @stu1130