
Do you speed up by TensorRT? #45

Closed
wu-ruijie opened this issue Jun 12, 2020 · 18 comments

Labels
enhancement (New feature or request), Stale

Comments

@wu-ruijie

v5 is so fast! I can only imagine how much faster it would be with TensorRT acceleration. Do you have any work on this?

@github-actions
Contributor

github-actions bot commented Jun 12, 2020

Hello @wu-ruijie, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook (Open in Colab), Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@glenn-jocher
Member

glenn-jocher commented Jun 12, 2020

@wang-xinyu did a great TensorRT implementation of our https://github.com/ultralytics/yolov3 repo (which supports both YOLOv3 and YOLOv4), so he is probably best placed to answer this question:
https://github.com/wang-xinyu/tensorrtx/tree/master/yolov3-spp

@glenn-jocher glenn-jocher added the enhancement New feature or request label Jun 12, 2020
@HaxThePlanet

Tensor core support would be amazing!

@sljlp

sljlp commented Jun 13, 2020

Hi! I tested yolov5-s on CPU by directly running detect.py, and the inference speed is only 3 FPS. Could you please give me some advice? I want to reach at least 30 FPS.

@glenn-jocher
Member

@sljlp you might want to see 'Running yolov5 on CPU' #37

The default --img-size for detect.py is 640, which you can reduce significantly to get the FPS you are looking for.

@glenn-jocher
Member

@sljlp one caveat is --img-size must be a multiple of the largest stride, 32. So acceptable sizes are 320, 288, 256, etc.
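
For reference, a run at a reduced size might look like this (a minimal sketch; the weights and source paths are placeholders, and the flag names follow detect.py as described above):

```bash
# Run detect.py on CPU at a smaller input size (must be a multiple of 32) for higher FPS.
# The weights file and source path below are illustrative placeholders.
python detect.py --weights yolov5s.pt --source data/images --img-size 320 --device cpu
```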

@glenn-jocher
Member

Update: I've pushed more robust error-checking on --img-size now in 099e6f5, so if a user accidentally requests an invalid size (which is not divisible by 32), the code will warn and automatically correct the value to the nearest valid --img-size.
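
For illustration, a minimal sketch of that kind of check (this is not the exact code from 099e6f5; the function name and warning message are illustrative):

```python
import math

def check_img_size(img_size: int, stride: int = 32) -> int:
    """Round img_size up to the nearest multiple of stride, warning if it had to be adjusted."""
    new_size = max(stride, math.ceil(img_size / stride) * stride)
    if new_size != img_size:
        print(f"WARNING: --img-size {img_size} must be a multiple of {stride}, using {new_size}")
    return new_size

# e.g. check_img_size(300) returns 320
```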

@thancaocuong

@glenn-jocher Can you provide a yolov5.weights file? I've found that to convert YOLO to TensorRT, we need the weights file to use with https://github.com/wang-xinyu/tensorrtx/

@glenn-jocher
Member

@thancaocuong there is no such file.

@TrojanXu

TrojanXu commented Jun 17, 2020

I have a Python implementation here, with NMS: https://github.com/TrojanXu/yolov5-tensorrt
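
For anyone new to the TensorRT Python API, the core of a pipeline like this is deserializing a prebuilt engine and then running inference on it; a minimal sketch of the loading step is below (the engine file name is a placeholder, and the full pipeline in the linked repo also covers preprocessing and NMS):

```python
import tensorrt as trt

# Minimal sketch: deserialize a previously built, serialized TensorRT engine.
# "yolov5s.engine" is a placeholder for whatever file your conversion produced.
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with open("yolov5s.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    print(f"Loaded engine with {engine.num_bindings} bindings")
```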

@wang-xinyu
Contributor

Hi @glenn-jocher

I just implemented yolov5-s in my repo https://github.com/wang-xinyu/tensorrtx/tree/master/yolov5 and tested it on my machine. yolov5-m, yolov5-l, etc. will come out soon.

| Models | Device | BatchSize | Mode | Input Shape (HxW) | FPS |
| --- | --- | --- | --- | --- | --- |
| YOLOv3-spp (darknet53) | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 38.5 |
| YOLOv4 (CSPDarknet53) | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 35.7 |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 167 |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 4 | FP16 | 608x608 | 182 |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 8 | FP16 | 608x608 | 186 |

@wang-xinyu
Contributor

Update! My TensorRT implementation has been updated according to this commit 364fcfd.

The PANet part has been updated.

Please find my repo https://github.com/wang-xinyu/tensorrtx

@alexandrebvd

> Update! My TensorRT implementation has been updated according to this commit 364fcfd. The PANet part has been updated. Please find my repo https://github.com/wang-xinyu/tensorrtx

Thanks for sharing! Do you have plans to implement other yolov5 versions as well?

@github-actions
Contributor

github-actions bot commented Aug 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Aug 3, 2020
@wang-xinyu
Contributor

We have updated the yolov5 TensorRT implementation according to the v2.0 release of this repo, and ran a speed test on my machine.

| Models | Device | BatchSize | Mode | Input Shape (HxW) | FPS |
| --- | --- | --- | --- | --- | --- |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 142 |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 4 | FP16 | 608x608 | 173 |
| YOLOv5-s | Xeon E5-2620 / GTX1080 | 8 | FP16 | 608x608 | 190 |
| YOLOv5-m | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 71 |
| YOLOv5-l | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 40 |
| YOLOv5-x | Xeon E5-2620 / GTX1080 | 1 | FP16 | 608x608 | 27 |

Please find https://github.com/wang-xinyu/tensorrtx.

@glenn-jocher could you also add a link to https://github.com/wang-xinyu/tensorrtx in your Tutorials section?

@glenn-jocher
Member

glenn-jocher commented Aug 4, 2020

@wang-xinyu thanks, yes this is a good idea. Can you submit a PR for the README please?

EDIT: I'll add a link to the export tutorial also.


@fire717

fire717 commented Nov 4, 2021

> We have updated the yolov5 TensorRT implementation according to the v2.0 release of this repo, and ran a speed test on my machine. Please find https://github.com/wang-xinyu/tensorrtx.
>
> @glenn-jocher could you also add a link to https://github.com/wang-xinyu/tensorrtx in your Tutorials section?

Thanks for your work. I just wonder how you test FPS with a batch size.
Since our video is a single image stream where every frame arrives serially, how can you use a batch size larger than 1?
