CUDA out of memory, I set batch_size 1, it does not work #1381

futureflsl · 2020-07-12T10:48:26Z

❔Question

I get a CUDA out of memory error even with batch_size set to 1. My system is Ubuntu 18.04 with a GTX 1050 (4 GB of memory). I set the image size in the cfg to 224 and batch_size to 1. When I train on my custom dataset (in fact, it is a COCO dataset) like this:
python train.py --batch_size 1
the following error occurs:
CUDA out of memory
I think this project must have a bug; it does not handle memory very well.

github-actions · 2020-07-12T10:49:15Z

Ultralytics has open-sourced YOLOv5 at https://github.com/ultralytics/yolov5, featuring faster, lighter and more accurate object detection. YOLOv5 is recommended for all new projects.

** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.

August 13, 2020: v3.0 release: nn.Hardswish() activations, data autodownload, native AMP.
July 23, 2020: v2.0 release: improved model definition, training and mAP.
June 22, 2020: PANet updates: new heads, reduced parameters, improved speed and mAP 364fcfd.
June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
June 9, 2020: CSP updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
May 27, 2020: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
April 1, 2020: Start development of future compound-scaled YOLOv3/YOLOv4-based PyTorch models.

Pretrained Checkpoints

Model	AP^val	AP^test	AP₅₀	Speed_GPU	FPS_GPU	params	FLOPS
YOLOv5s	37.0	37.0	56.2	2.4ms	416	7.5M	13.2B
YOLOv5m	44.3	44.3	63.2	3.4ms	294	21.8M	39.4B
YOLOv5l	47.7	47.7	66.5	4.4ms	227	47.8M	88.1B
YOLOv5x	49.2	49.2	67.7	6.9ms	145	89.0M	166.4B

YOLOv5x + TTA	50.8	50.8	68.9	25.5ms	39	89.0M	354.3B

YOLOv3-SPP	45.6	45.5	65.2	4.5ms	222	63.0M	118.0B

** AP^test denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.001
** Speed_GPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
** Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce by python test.py --data coco.yaml --img 832 --augment

For more information and to get started with YOLOv5 please visit https://github.com/ultralytics/yolov5. Thank you!

DarrVeter · 2020-07-24T13:45:58Z

Try decreasing the image size by adding --img 512, or 416 or 320.
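A minimal sketch of how one might step down through image sizes, assuming the network needs sizes that are multiples of 32 (the largest stride in YOLOv3-style models). The helper `candidate_image_sizes` is hypothetical and not part of the repository. The printed flags follow the `--batch-size` / `--img-size` spelling used elsewhere in this thread and may differ in your checkout.

```python
def candidate_image_sizes(start, minimum=224, stride=32):
    """Return image sizes from `start` down to `minimum`, each a multiple of `stride`."""
    # Round the starting size down to the nearest multiple of the stride.
    largest = (start // stride) * stride
    return list(range(largest, minimum - 1, -stride))


# Print one candidate command per size, largest first; try them in order
# and stop at the first one that trains without running out of memory.
for size in candidate_image_sizes(512):
    print(f"python train.py --batch-size 1 --img-size {size}")
```

Starting from the largest size that fits keeps as much resolution as possible, which matters because smaller inputs can reduce detection accuracy.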

shahabe · 2020-08-06T13:10:14Z

I have the same issue and could not solve it by decreasing the image size.
My GPU has 12 GB of memory, but I still get 'CUDA out of memory' after some iterations.
@futureflsl were you able to solve your issue?

Suncheng2019 · 2020-08-17T11:13:00Z

emm, I get the same problem.

adodd202 · 2020-08-26T13:35:37Z

I had the same problem with my RTX 2070 (about 8 GB VRAM). I changed my command to:
python train.py --data data/coco_1cls.data --batch-size 1 --img-size 224
Now it is training and has gotten past the first few epochs.
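One detail worth checking: the original post spells the flag `--batch_size`, while the command above uses `--batch-size`. To Python's `argparse` these are different option strings. The sketch below uses a hypothetical stand-in parser (the defaults are made up, and this is not the repository's actual `train.py`) to show that the underscore spelling does not set the batch size.

```python
import argparse

# Hypothetical stand-in for the training script's parser; the flag names are
# copied from the command above, the defaults are made up for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--batch-size", type=int, default=16)
parser.add_argument("--img-size", type=int, default=416)

# parse_known_args collects unrecognized arguments instead of exiting.
opt, unknown = parser.parse_known_args(["--batch_size", "1", "--img-size", "224"])

print(opt.batch_size)  # still the default: the underscore flag was not recognized
print(opt.img_size)    # set from the command line
print(unknown)         # the unrecognized flag and its value end up here
```

A plain `parse_args` call would instead exit with an "unrecognized arguments" error, so if the script started at all, the flag was probably typed differently from what the post shows. Either way, it is worth confirming the effective batch size in the training log.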

github-actions · 2020-09-26T00:31:35Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

glenn-jocher · 2023-11-14T17:27:26Z

@adodd202 glad to hear it's working for you now! The CUDA out of memory error commonly occurs when training with a large image size or on a GPU with limited memory. Lowering the image size with --img-size and reducing --batch-size, as you did, can help alleviate this issue. Keep in mind that reducing the image size may affect detection accuracy. Good luck with your training!
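As a rough illustration of why both settings matter, the sketch below assumes that activation memory in a convolutional network grows approximately in proportion to batch size times image area. This is a first-order approximation, not a measurement; the function name `relative_activation_memory` and the reference values are hypothetical.

```python
def relative_activation_memory(batch_size, img_size, ref_batch_size, ref_img_size):
    """Estimate activation memory relative to a reference setting.

    Assumes memory scales with batch_size * img_size**2 (square images).
    """
    return (batch_size * img_size ** 2) / (ref_batch_size * ref_img_size ** 2)


# Going from 416x416 to 224x224 at the same batch size.
ratio = relative_activation_memory(1, 224, 1, 416)
print(f"estimated activation memory vs. reference: {ratio:.2f}x")
```

Model weights, optimizer state, and the CUDA context take a fixed amount of memory that does not shrink with image size, so the real saving is smaller than this ratio suggests. That fixed overhead matters most on small GPUs such as a 4 GB card.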

futureflsl added the question label Jul 12, 2020

github-actions bot added the Stale label Sep 26, 2020

github-actions bot closed this as completed Oct 1, 2020
