
Batch Detection #7683

Closed · 1 of 2 tasks
rafcy opened this issue May 3, 2022 · 5 comments
Labels: bug (Something isn't working), Stale

Comments

rafcy commented May 3, 2022

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

No response

Bug

Hello everyone,
I am experimenting with tiling: I take an image, split it into patches, and batch-detect objects on those patches, but this adds much more delay instead of speeding things up. I don't know what I am doing wrong, but nothing I measure comes close to the batched inference times mentioned in the documentation. I get about 20 FPS on a single 1080p image (using a custom-trained YOLOv5s model), and when splitting the image into 15 patches I get 5 FPS.
Things I tried (see the sketch after this list):

  • using PyTorch Hub, both with 'ultralytics/yolov5' and with a local repo
  • using the code in detect.py of YOLOv5
  • stacking the images into a tensor, instead of a tuple, before passing them to model(imgs); but then it returns a tensor of size [16128, 9] for each image instead of pandas DataFrames, and to get the actual results for each image I need to call the NMS function on the [batch, 16128, 9] tensor, which results in a huge delay.

All of these give the same FPS, and yes, I am running on a GPU (RTX 2070).
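For reference, a minimal sketch of the tiling + batched Hub inference flow described above (the weights path, file name, and patch size are assumptions; the Hub AutoShape model accepts a list of numpy arrays, runs them as one batch, and applies NMS internally, so the manual raw-tensor NMS step is avoided):

import cv2
import torch

# load custom-trained weights via PyTorch Hub (path is hypothetical)
model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')

img = cv2.imread('frame.jpg')[..., ::-1]  # BGR -> RGB (file name is hypothetical)
ph, pw = 512, 512  # patch size (assumption; no overlap handling here)
offsets = [(x, y)
           for y in range(0, img.shape[0], ph)
           for x in range(0, img.shape[1], pw)]
patches = [img[y:y + ph, x:x + pw] for x, y in offsets]

results = model(patches, size=640)  # one batched forward pass over all patches
for (x0, y0), df in zip(offsets, results.pandas().xyxy):  # one DataFrame per patch
    df[['xmin', 'xmax']] += x0  # map boxes back to full-frame coordinates
    df[['ymin', 'ymax']] += y0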

Another example I've tried: splitting a 4K image into 60 patches of 512 x 512 and detecting them with the PyTorch Hub example as a tuple. The reported performance is:
Speed: 5.5ms pre-process, 56.4ms inference, 0.8ms NMS per image at shape (60, 3, 640, 640)
but the call actually needed 3.7 seconds to run. The reported speeds are per image, so the whole batch takes roughly 60 x (5.5 + 56.4 + 0.8) ms ≈ 3.8 s, which matches what I observe. That makes the numbers misleading for my use case, since I wanted batched inference to reduce the total time.

Please help me figure out whether I am doing something wrong, or whether these results are normal and I should stop trying to find a solution to my issue.
Thank you in advance.

Environment

I am using a custom-made Docker image that includes:

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@rafcy rafcy added the bug Something isn't working label May 3, 2022
glenn-jocher (Member) commented May 3, 2022

@rafcy 👋 Hello! Thanks for asking about inference speed issues. PyTorch Hub speeds will vary by hardware, software, model, inference settings, etc. Our default example in Colab with a V100 looks like this:

[Screenshot: Colab V100 PyTorch Hub benchmark output]

YOLOv5 🚀 can be run on CPU (i.e. --device cpu, slow) or GPU if available (i.e. --device 0, faster). You can determine your inference device by viewing the YOLOv5 console output:

detect.py inference

python detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source data/images/

[Screenshot: detect.py console output showing the inference device]

YOLOv5 PyTorch Hub inference

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Images
dir = 'https://ultralytics.com/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')]  # batch of images

# Inference
results = model(imgs)
results.print()  # or .show(), .save()
# Speed: 631.5ms pre-process, 19.2ms inference, 1.6ms NMS per image at shape (2, 3, 640, 640)
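As a side note, per-image detections from a batched call like the one above can be read back from the returned Detections object; a short sketch (loop variable names are illustrative):

# one pandas DataFrame per input image, in input order
for i, df in enumerate(results.pandas().xyxy):
    print(f'image {i}: {len(df)} detections')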

Increase Speeds

If you would like to increase your inference speed, some options are:

  • Use batched inference with YOLOv5 PyTorch Hub
  • Reduce --img-size, i.e. 1280 -> 640 -> 320
  • Reduce model size, i.e. YOLOv5x -> YOLOv5l -> YOLOv5m -> YOLOv5s -> YOLOv5n
  • Use half precision FP16 inference with python detect.py --half and python val.py --half
  • Use a faster GPU, i.e. P100 -> V100 -> A100
  • Export to ONNX or OpenVINO for up to 3x CPU speedup (CPU Benchmarks)
  • Export to TensorRT for up to 5x GPU speedup (GPU Benchmarks)
  • Use free GPU backends with up to 16GB of CUDA memory (Colab, Kaggle)
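For example, several of these options can be combined in a single detect.py call; the weights and image size below are only illustrative:

python detect.py --weights yolov5n.pt --img 320 --half --device 0 --source data/images/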

Good luck 🍀 and let us know if you have any other questions!

rafcy (Author) commented May 4, 2022

Hello @glenn-jocher, and thank you for the response. As far as I understand from running some tests in Colab, batch detection does not help speed up the detection process at all, right?

glenn-jocher (Member) commented

@rafcy see https://community.ultralytics.com/t/yolov5-study-batch-size-vs-speed

rafcy (Author) commented May 4, 2022

I have seen that, and I ran my own test in Colab as well to check the results. The inference time per image does decrease a bit as the batch size grows, but the overall delay gets much larger.
You can see my Colab test here:
https://colab.research.google.com/drive/1jmm0U_T1RNKQ3VSYKpKEulSZuR5-42-V?usp=sharing

1 image time: 0.0473322868347168
Speed: 18.2ms pre-process, 26.4ms inference, 2.1ms NMS per image at shape (1, 3, 384, 640)
4 images time: 0.17291879653930664
Speed: 17.2ms pre-process, 23.8ms inference, 1.8ms NMS per image at shape (4, 3, 384, 640)
8 images time: 0.37774181365966797
Speed: 17.1ms pre-process, 27.3ms inference, 2.5ms NMS per image at shape (8, 3, 384, 640)
16 images time: 0.5562183856964111
Speed: 17.7ms pre-process, 14.7ms inference, 2.0ms NMS per image at shape (16, 3, 384, 640)
32 images time: 1.035698413848877
Speed: 17.0ms pre-process, 13.1ms inference, 2.0ms NMS per image at shape (32, 3, 384, 640)
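
For reference, a minimal sketch of the timing loop that could produce numbers like the above (the model, local image file, and batch sizes are assumptions):

import time
import numpy as np
import torch
from PIL import Image

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
img = np.array(Image.open('zidane.jpg'))  # hypothetical local file, loaded once

for n in (1, 4, 8, 16, 32):
    imgs = [img] * n  # batch of n copies of the same image
    t0 = time.time()
    results = model(imgs)  # one batched call
    print(f'{n} images time: {time.time() - t0}')
    results.print()  # prints the per-image Speed: line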

I know I may sound annoying with my queries, but I am just trying to figure out whether YOLOv5's batch detection works for my application.
For instance, your batch-size comparison basically states that for the YOLOv5s model, larger batches take almost the same overall duration as batch size 1, but my comparison below shows this is in fact not true. Am I doing something wrong, or am I misunderstanding something?

Batch Size | YOLOv5s
---------- | -------
1          | 1.0
8          | 7.0

Thank you in advance.

github-actions bot (Contributor) commented Jun 4, 2022

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcome!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
