
CUDA memory leak #2586

Closed
awsaf49 opened this issue Mar 24, 2021 · 3 comments
Labels
bug Something isn't working

Comments

awsaf49 (Contributor) commented Mar 24, 2021

I'm getting a CUDA out-of-memory error when training YOLOv5 at image size 1024 with batch_size 4. I have trained at 1024 with batch_size 4 before without any error, but now I'm getting this one.
The strange part is that during training the GPU usage is around 10 GB and it trains perfectly; the error is only thrown during validation.
[screenshot of the CUDA out-of-memory error]
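
For anyone hitting the same symptom, a minimal generic PyTorch sketch (not the actual YOLOv5 validation code) of the usual first thing to check when validation uses far more memory than training at the same batch size: running the validation forward pass with autograd enabled keeps activations alive for a backward pass that never happens.

```python
import torch

def validate(model, loader, device):
    # Minimal sketch, assuming a generic PyTorch model and dataloader.
    model.eval()
    with torch.no_grad():  # drop activation storage during inference
        for images, targets in loader:
            preds = model(images.to(device, non_blocking=True))
            # ... compute metrics on preds vs targets ...
    # Optionally release cached blocks between the train and val phases.
    torch.cuda.empty_cache()
```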

awsaf49 added the bug label on Mar 24, 2021
glenn-jocher (Member) commented

@awsaf49 I reviewed the code here; it seems the testing batch size was inadvertently affected by PR #2125. I've pushed a fix in #2587, though note this should only affect memory during testing in Multi-GPU training.
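
For context, a hedged sketch of the class of bug described above: in multi-GPU (DDP) training the global batch is split across devices, and the test dataloader should use the per-GPU share. All names below are illustrative, not the actual code touched by #2125 or #2587.

```python
import torch

# Hypothetical illustration; variable and function names are made up.
world_size = max(torch.cuda.device_count(), 1)  # number of GPUs in DDP
total_batch_size = 64                           # global batch across all GPUs
batch_size = total_batch_size // world_size     # per-GPU share used in training

# Buggy: building the test dataloader with the global batch size makes each
# GPU validate world_size times more images at once than it trains on.
# val_loader = make_dataloader(val_path, batch_size=total_batch_size)

# Fixed: validation reuses the per-GPU batch size.
# val_loader = make_dataloader(val_path, batch_size=batch_size)
```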

awsaf49 (Contributor, Author) commented Mar 25, 2021

But I was using a single GPU, and unfortunately the issue wasn't solved by this change.

glenn-jocher (Member) commented Mar 25, 2021

@awsaf49 thanks for the info! If you believe you have a reproducible issue, we suggest you close this issue and raise a new one using the 🐛 Bug Report template, providing screenshots and a minimum reproducible example to help us better understand and diagnose your problem. Thank you!
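
In case it helps when putting together such a reproducible example, a minimal sketch of what it could log (generic PyTorch; the commented-out train/validate helpers are placeholders, not YOLOv5 functions), so the report shows exactly where memory usage jumps:

```python
import torch

def log_mem(tag):
    # Report allocated vs reserved CUDA memory in GB for the current device.
    alloc = torch.cuda.memory_allocated() / 1e9
    reserved = torch.cuda.memory_reserved() / 1e9
    print(f"{tag}: allocated={alloc:.2f} GB, reserved={reserved:.2f} GB")

log_mem("before training epoch")
# train_one_epoch(model, train_loader)   # placeholder for the training step
log_mem("after training epoch")
# validate(model, val_loader)            # placeholder for the validation step
log_mem("after validation")
```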
