Multi GPU Training #7608
@Venky0892 see the Multi-GPU Training tutorial for the correct commands: YOLOv5 Tutorials
Good luck 🍀 and let us know if you have any other questions!
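For context, the Multi-GPU Training tutorial's DDP launch looks along these lines; this is a hedged sketch, and the dataset, weights, and batch values are placeholders to adapt to your setup, not values from this thread:

```shell
# Hypothetical 4-GPU DDP launch per the YOLOv5 Multi-GPU Training tutorial.
# --nproc_per_node spawns one process per GPU; --device lists the GPUs to use.
NPROC=4
CMD="python -m torch.distributed.launch --nproc_per_node $NPROC train.py \
  --batch 64 --data coco.yaml --weights yolov5s.pt --device 0,1,2,3"
echo "$CMD"
```

Note that `--nproc_per_node` and `--device` should agree: four processes, four listed GPU indices.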
Hi @glenn-jocher, I want to use four GPUs, so I set --nproc_per_node 4, but I can only see GPU 0 active and running. I'm not sure why; it would be great if you could assist me with this.
@Venky0892 The command you provided looks correct, so it's strange that only GPU 0 is active. Would you mind checking if the
Search before asking
Question
Hi, I'm trying to train around 200K images on Tesla V100 GPUs; I have 4 of them in my compute instance. I'm training with the command "python -m torch.distributed.launch --nproc_per_node 4" as mentioned in the documentation, but I can only see one GPU active and I'm not sure why. I checked the GPU usage metric in the Wandb tool. Can someone help me with this? If I run normal training without distributed launch, I can see all the GPUs being used.
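One common cause of only GPU 0 being active (not confirmed in this thread; a minimal diagnostic sketch assuming standard CUDA semantics) is an accidental `CUDA_VISIBLE_DEVICES` export that hides the other GPUs from PyTorch:

```python
import os


def visible_gpu_ids(env=os.environ):
    """Return the GPU indices exposed via CUDA_VISIBLE_DEVICES.

    If the variable is unset, CUDA sees every GPU on the machine
    (represented here as None). A stray export like
    CUDA_VISIBLE_DEVICES=0 would explain why only GPU 0 runs,
    even with --nproc_per_node 4.
    """
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # all GPUs visible
    return [int(i) for i in raw.split(",") if i.strip() != ""]


if __name__ == "__main__":
    print(visible_gpu_ids())
```

If this prints `[0]` in the shell used for training, unsetting the variable (or setting it to `0,1,2,3`) before launching would be worth trying; checking `nvidia-smi` during training is another quick cross-check.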
Additional
No response