TensorRT inference with C++ for yolov7 #95

Open
linghu8812 opened this issue Jul 12, 2022 · 12 comments
@linghu8812 (Contributor) commented Jul 12, 2022

Hello everyone, the repo that provides TensorRT C++ inference for YOLOv4 (AlexeyAB/darknet#7002), Scaled-YOLOv4 (WongKinYiu/ScaledYOLOv4#56), YOLOv5 (ultralytics/yolov5#1597), and YOLOv6 (meituan/YOLOv6#122) now also supports yolov7 inference. All of the yolov7 pretrained models can be converted to ONNX and then to a TensorRT engine.

1. Export ONNX Model

First download the yolov7 models to the weights folder, then use the following commands to export the ONNX model:

git clone https://github.com/linghu8812/yolov7.git
cd yolov7
python export.py --weights ./weights/yolov7.pt --simplify --grid 

If you want to export an ONNX model with a 1280 input size, add --img-size to the command:

python export.py --weights ./weights/yolov7-w6.pt --simplify --grid --img-size 1280
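The --img-size value matters because YOLO-style detect heads predict on fixed-stride grids, so the input size must be divisible by every stride. A quick pure-Python sketch; the strides are an assumption based on standard YOLO heads (8/16/32, plus 64 for the P6 "-w6" models), not taken from the repo:

```python
# Sketch (assumption): detect heads work on feature maps downsampled by
# fixed strides, so img_size must be divisible by each stride.
def grid_sizes(img_size, strides=(8, 16, 32)):
    """Return the (h, w) grid of each detect head for a square input."""
    if any(img_size % s for s in strides):
        raise ValueError(f"img_size {img_size} must be divisible by every stride in {strides}")
    return [(img_size // s, img_size // s) for s in strides]

print(grid_sizes(640))                    # [(80, 80), (40, 40), (20, 20)]
print(grid_sizes(1280, (8, 16, 32, 64)))  # adds a fourth, coarser head
```

The 80x80 grid for a 640 input is the same one that shows up in the TensorRT broadcast errors discussed further down this thread.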

2. Build the yolov7_trt Project

mkdir build && cd build
cmake ..
make -j

3. Run yolov7_trt

  • inference with yolov7
./yolov7_trt ../config.yaml ../samples
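The run command reads a config.yaml. The sketch below is illustrative only; the key names are assumptions, so compare against the sample config shipped in the tensorrt_inference repo before using it:

```yaml
# Hypothetical sketch of a yolov7 config.yaml; key names are assumptions,
# not the repo's actual schema -- check the sample in the repo.
yolov7:
  onnx_file: ../yolov7.onnx     # model exported in step 1
  engine_file: ../yolov7.trt    # serialized TensorRT engine
  labels_file: ../coco.names    # class names, one per line
  BATCH_SIZE: 1
  IMAGE_WIDTH: 640              # must match the exported --img-size
  IMAGE_HEIGHT: 640
  obj_threshold: 0.4            # confidence threshold
  nms_threshold: 0.45           # IoU threshold for NMS
```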

4. Results

(detection results image)

@WongKinYiu (Owner)

Thanks.

@philipp-schmidt (Contributor)

@linghu8812
Your changes to the export.py are very useful. Why not make this a PR?
The onnx-simplify step is necessary for the ONNX to work correctly for the Detect() layer in many cases.
So it's a good idea to have that in there anyway.

@linghu8812 (Contributor, Author) commented Jul 13, 2022

@philipp-schmidt I have already made a PR: #114

@BenRK-Work

@linghu8812 what version of onnxsim are you using? I keep getting the following error when trying:

Simplifier failure: [ONNXRuntimeError] : 1 : FAIL : Node (Mul_390) Op (Mul) [ShapeInferenceError] Incompatible dimensions

@linghu8812 (Contributor, Author)

> what version of onnxsim are you using?

0.3.6

@akashAD98 (Contributor)

@linghu8812 good work as always. Is there a good way to learn all this model optimization & quantization? Could you teach or mentor us? I really want to understand every step of the model conversion. Thanks.

@leeyunhome

Hello, @linghu8812

Thank you for your effort.
Have you compared the performance difference with yolov5?

Thank you.

@oralian commented Jul 20, 2022

Hey, I'm getting:

Namespace(batch_size=2, device='0', dynamic=False, grid=False, img_size=[1024, 1024], simplify=True, weights='yolov7.pt')
YOLOR 🚀 v0.1-38-ge9f7c15 torch 1.10.0 CUDA:0 (Xavier, 7773.43359375MB)

Fusing layers... 
RepConv.fuse_repvgg_block
RepConv.fuse_repvgg_block
RepConv.fuse_repvgg_block
Model Summary: 306 layers, 36905341 parameters, 36905341 gradients
Killed

I'm using a Jetson NX and yolov7.pt. It seems to crash at y = model(img) in export.py, with the RAM reaching its maximum. Any ideas on how I could get this to work?

@oralian commented Jul 20, 2022

I was able to solve my issue by generating the ONNX file on my desktop computer and copying it to the Jetson; I could then convert it to TensorRT. However, I'm getting 0.094 s inference time with the yolov7.pt weights vs. 0.058 s with the yolov5m6.pt weights. Is it supposed to be like this? Are there any faster yolov7 models? Thanks!
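For timing comparisons like this it helps to warm up before measuring, since the first few CUDA/TensorRT calls include lazy initialization and can inflate the average. A minimal, framework-agnostic sketch; the `infer` callable is a placeholder for whatever runs one forward pass:

```python
import time

def bench(infer, warmup=3, repeats=20):
    """Average wall-clock seconds per call of `infer`, after a warm-up."""
    for _ in range(warmup):       # discard slow first calls (lazy init, caching)
        infer()
    t0 = time.perf_counter()
    for _ in range(repeats):
        infer()
    return (time.perf_counter() - t0) / repeats
```

Comparing both models with the same batch size, input resolution, and precision (FP16 vs FP32) also matters; yolov7.pt at 640 vs yolov5m6.pt at 1280 is not an apples-to-apples comparison.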

@mochechan

For my evaluation, step 1 uses a computer with an RTX 2080 Ti; this step seems to be fine. Steps 2 and 3 use an NVIDIA Jetson Xavier NX with JetPack 4.5.1.

The step "3. Run yolov7_trt" produces the following error messages. How can I solve this?

$ ./yolov7_trt ../config.yaml ../samples
----------------------------------------------------------------
Input filename:   ../yolov7.onnx
ONNX IR version:  0.0.6
Opset version:    12
Producer name:    pytorch
Producer version: 1.10
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[07/21/2022-09:32:58] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
(the Mul_322 error above is repeated 17 more times)
ERROR: builtin_op_importers.cpp:3040 In function importSlice:
[4] Assertion failed: -r <= axis && axis < r
[07/21/2022-09:32:58] [E] Failure while parsing ONNX file
start building engine
[07/21/2022-09:32:58] [E] [TRT] Network must have at least one output
[07/21/2022-09:32:58] [E] [TRT] Network validation failed.
build engine done
yolov7_trt: /home/a/tensorrt_inference/yolov7/../includes/common/common.hpp:138: void onnxToTRTModel(const string&, const string&, nvinfer1::ICudaEngine*&, const int&): Assertion `engine' failed.
Aborted (core dumped)

@linghu8812 (Contributor, Author)

Use PyTorch 1.11 and ONNX 1.12; the anchor tensor should have shape 1x3x1x1x2 (not 1x1x1x3x2), so that it broadcasts against the 1x3x80x80x2 grid.
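The TensorRT error follows directly from NumPy-style broadcast rules: shapes are aligned from the trailing axis, and each pair of dimensions must be equal or one of them must be 1. A minimal pure-Python check illustrating why 1x1x1x3x2 fails against the grid while 1x3x1x1x2 works:

```python
# Minimal NumPy-style broadcast compatibility check (pure Python).
def broadcastable(a, b):
    """True if shapes a and b can be broadcast together."""
    return all(x == y or x == 1 or y == 1
               for x, y in zip(reversed(a), reversed(b)))

grid  = (1, 3, 80, 80, 2)  # detect-head output from the log above
bad   = (1, 1, 1, 3, 2)    # anchor shape baked into the failing ONNX
fixed = (1, 3, 1, 1, 2)    # anchor shape suggested above

print(broadcastable(grid, bad))    # False: 80 vs 3 on the fourth axis
print(broadcastable(grid, fixed))  # True
```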

@Rohan-Python

Hey, can you please help me? I'm facing an issue when running inference on my system GPU: the bounding boxes do not show up when I use the GPU; they show up only when I use the CPU for inference (--device cpu). I trained the yolov7 model on Colab and am using the best.pt file as weights for inference.
