[Pytorch] move to GPU extremely slow. #1083

Closed
guangster opened this issue Sep 15, 2021 · 2 comments

Comments

@guangster

When moving the model to the GPU using module.to(...), the program hangs for an extremely long time (almost half an hour), even though the model is tiny and can be created on the GPU nearly instantly in Python.

This is the code snippet where the freeze occurs. If I wait long enough, the code eventually continues fine, so I think CUDA itself is OK.

import org.bytedeco.pytorch.*;
import org.bytedeco.pytorch.global.torch;

Module model = ...
DeviceGuard g = new DeviceGuard(new Device(torch.kCUDA()));
model.to(g.current_device()); // hangs here for a long time

Any help is appreciated!

@saudet
Member

saudet commented Sep 16, 2021

The binaries are compiled for only a single GPU architecture, so it's probably just JIT-compiling PTX code for your GPU architecture. If you set your compute cache to something large enough, like 256 MB, it should only do it once. You could also build from source for your architecture, but we cannot easily build the binaries with more architectures on GitHub Actions, since that would exceed the hard limit of 6 hours, and I don't have the resources to maintain any additional infrastructure.
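
A minimal sketch of what "setting the compute cache" means in practice, assuming it is controlled through the standard CUDA_CACHE_MAXSIZE environment variable (the variable name and the 256 MB figure come from NVIDIA's JIT-cache documentation, not from this project):

// Minimal sketch: check whether the CUDA JIT compute cache has been enlarged.
// CUDA_CACHE_MAXSIZE (size in bytes) must be set in the environment *before* the JVM starts,
// for example: export CUDA_CACHE_MAXSIZE=268435456   (256 MB)
public class CudaCacheCheck {
    public static void main(String[] args) {
        String maxSize = System.getenv("CUDA_CACHE_MAXSIZE");
        if (maxSize == null) {
            System.err.println("CUDA_CACHE_MAXSIZE is not set; the driver default may be too small,"
                    + " so the PTX may be JIT-compiled again on every run.");
        } else {
            System.out.println("CUDA JIT compute cache limited to " + maxSize + " bytes.");
        }
    }
}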

Another thing you could do is use the binaries from LibTorch. Simply extract it somewhere on your system, include its libraries somewhere in your system PATH, and set the "org.bytedeco.javacpp.pathsFirst" system property to "true" before loading anything with JavaCPP.
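
For example, assuming the extracted LibTorch libraries are already on the system PATH, the property just needs to be set before anything backed by JavaCPP is touched (a minimal sketch):

public class UseSystemLibTorch {
    public static void main(String[] args) {
        // Must run before any JavaCPP-loaded class (e.g. org.bytedeco.pytorch.*) is referenced,
        // so that the libraries found on the system PATH are loaded instead of the bundled ones.
        System.setProperty("org.bytedeco.javacpp.pathsFirst", "true");

        // ... load and use the PyTorch presets as usual from here on.
    }
}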

Incidentally, that is also one way that you could make JavaCPP load the same libraries as DJL.
/cc @frankfliu @stu1130

@guangster
Author

guangster commented Sep 17, 2021

Thanks, this worked!
For clarity: I followed the second suggestion above and downloaded the LibTorch binaries matching my CUDA version (11.1).
