Updating to Cuda 11.1 and Ubuntu 20.04 #30
Comments
We're working on the LTS as part of a larger effort in #27, but unfortunately I've been short on time. It will happen though. Congrats on landing one; I feel it'll be a while before I'm able to. |
Thank you @jnfinitym! I'm looking forward to seeing your PR :) |
I'd be happy to test any PRs with my RTX 3090. |
Not sure if this is because of the outdated library or not (I'm new to CUDA), but this is what happens with the existing build:
import tensorflow
from tensorflow.python.client import device_lib
tensorflow.config.list_physical_devices('GPU')
print(device_lib.list_local_devices())
|
My kernel just updated to CUDA 11.0 and I had an existing running Jupyter container based on an older CUDA version. @Manouchehri I ran your code without any runtime errors.
I think the cuDNN issues have been resolved with the latest image; this is an older one. Something to consider for sure... people's computers are updating automatically, and usually containers don't care, but here they do... Anyway, how did you set up the NVIDIA drivers on your computer? I'll push images with newer versions of CUDA this week and ping you, but this would help debug for now. @Manouchehri any tips for getting the card? It's been incredibly challenging. |
I installed the beta drivers from Nvidia's website (I thought the beta would be required, as support was just added in the 455.23.04 release).
wget "https://us.download.nvidia.com/XFree86/Linux-x86_64/455.23.04/NVIDIA-Linux-x86_64-455.23.04.run"
chmod +x NVIDIA-Linux-x86_64-455.23.04.run
sudo ./NVIDIA-Linux-x86_64-455.23.04.run # I kept the defaults, except I said "no" to having my Xorg config updated. It's a headless VM.
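As a rough sketch of how to check whether an installed driver is new enough for a given CUDA release, the version strings can be compared as integer tuples. The `MIN_DRIVER` table and the helper names here are illustrative assumptions, and the minimum versions are recalled from NVIDIA's CUDA release notes, so verify them there before relying on this.

```python
# ASSUMPTION: minimum Linux driver versions per CUDA release, recalled from
# NVIDIA's CUDA release notes. Double-check against the official table.
MIN_DRIVER = {
    "10.1": "418.39",
    "10.2": "440.33",
    "11.0": "450.51.05",
    "11.1": "455.23",
}


def parse_version(text):
    """Turn a dotted version such as '455.23.04' into (455, 23, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, cuda_version):
    """Return True if the driver meets the assumed minimum for cuda_version."""
    required = MIN_DRIVER[cuda_version]
    return parse_version(driver_version) >= parse_version(required)
```

The installed driver version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed in as `driver_version`.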
I joined the community NVIDIA Discord, and on the RTX 3090 launch day someone shared a link that would add the card directly to your cart, so you only had to load two or three pages. It basically cut the number of clicks in half. It was a last-minute change on the web store, so I don't think most bot authors had a chance to update their scripts before us humans grabbed all of them. (My order was placed at 9:13 AM EST, so it definitely didn't sell out in seconds like the RTX 3080.) |
Using
docker run --gpus all -d -it -p 127.0.0.1:8888:8888 -v $(pwd)/data:/mnt/space/ml -e GRANT_SUDO=yes --name tf-nightly-gpu-jupyter_1 tensorflow/tensorflow:nightly-gpu-jupyter
import tensorflow
from tensorflow.python.client import device_lib
tensorflow.config.list_physical_devices('GPU')
print(device_lib.list_local_devices())
|
@Manouchehri are there any upsides to using this project instead of TensorFlow's? I wasn't aware they had one like that. I suppose it has some different included packages, but practically speaking, how different are they? |
The TensorFlow Docker images are still based on Ubuntu 18.04, btw: |
Trying to update to
Fails with:
Seems to be related to: pytorch/vision#3264 and pytorch/vision#3207 |
There also seem to be version conflicts on WSL 2 and CUDA 11.1: As soon as PyTorch, TensorFlow, and WSL 2 accept CUDA 11, we will update it. Has anyone experienced severe disadvantages with CUDA 10.1? Maybe switching to 10.2 with 10.2-cudnn8-runtime-ubuntu18.04 would be an intermediate option? |
Cuda 10.2 seems to work well. |
Good news for this issue, I've created a branch v1.4_cuda-11.0_ubuntu-18.04 for images based on nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu18.04 |
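For anyone scripting against these base images, here is a minimal sketch that splits a tag such as `11.0.3-cudnn8-runtime-ubuntu18.04` into its parts. The pattern is only inferred from the two tags quoted in this thread, not from an official tag grammar, and `parse_cuda_tag` is a hypothetical helper name.

```python
import re

# ASSUMPTION: tag layout inferred from the tags quoted in this thread,
# e.g. "11.0.3-cudnn8-runtime-ubuntu18.04"; other tags may not match.
TAG_PATTERN = re.compile(
    r"^(?P<cuda>\d+\.\d+(?:\.\d+)?)"          # CUDA version, e.g. 11.0.3
    r"-(?:(?P<cudnn>cudnn\d+)-)?"             # optional cuDNN marker
    r"(?P<flavor>base|runtime|devel)"         # image flavor
    r"-(?P<os>[a-z]+)(?P<os_version>[\d.]+)$"  # OS name and version
)


def parse_cuda_tag(tag):
    """Return the tag's components as a dict, or None if it does not match."""
    match = TAG_PATTERN.match(tag)
    return match.groupdict() if match else None
```

Calling `parse_cuda_tag("11.0.3-cudnn8-runtime-ubuntu18.04")` yields the CUDA version, cuDNN marker, flavor, and OS as separate fields, which makes it easier to compare an image against the CUDA version a framework expects.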
I've created images for CUDA 11.0 and Ubuntu 20.04 that are available on Dockerhub:
I think I can close this issue now. If a new CUDA version is supported (especially for Tensorflow) you can reopen this issue. |
Hello, is the version v1.4_cuda-11.0_ubuntu-20.04 expected to work on CUDA 11.0? It seems it is still linked against CUDA 10.1.
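One way to confirm this kind of mismatch from inside a container is to compare the CUDA version TensorFlow reports it was built against with the one the image is supposed to ship. This is a minimal sketch: it assumes TensorFlow 2.3 or newer, where `tf.sysconfig.get_build_info()` is available, and that the returned dict carries a `cuda_version` key in dotted form. The helper names are hypothetical.

```python
def cuda_build_mismatch(build_info, expected_cuda):
    """Return a message if the build's CUDA version differs, else None.

    build_info is a dict like the one from tf.sysconfig.get_build_info().
    ASSUMPTION: 'cuda_version' is a dotted string such as '10.1'.
    """
    built = build_info.get("cuda_version")
    if built is None:
        return "build info has no cuda_version (CPU-only build?)"
    expected_parts = expected_cuda.split(".")
    built_parts = str(built).split(".")
    # Compare only as many components as the expected version specifies.
    if built_parts[: len(expected_parts)] != expected_parts:
        return f"TensorFlow built against CUDA {built}, expected {expected_cuda}"
    return None


def check_installed_tensorflow(expected_cuda):
    """Run the check against the TensorFlow installed in this environment."""
    import tensorflow as tf  # imported lazily so the helper above needs no TF

    return cuda_build_mismatch(tf.sysconfig.get_build_info(), expected_cuda)
```

Inside the image, `print(check_installed_tensorflow("11.0"))` prints `None` when the versions agree and a short description of the mismatch otherwise.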
|
You are right. The problem was that TensorFlow was not updated and the older version depends on The update is on the way. |
The commit e6300cd should have solved this issue. The images are currently being built and pushed. |
With all the shiny new GPUs coming out recently, I propose updating to use images that run on CUDA 11.1.
I will try to do that in a fork over the next few weeks. If the maintainer(s) here think this is a good plan, I am happy to submit a pull request once that is done and, as soon as I get mine, test it on an RTX 3080 to make sure it runs as it should.
In the same breath, I also propose moving the images to the new Ubuntu LTS.