
Ultralytics YOLOV5 model to TRT conversion issue. #1

Open
adrianosantospb opened this issue Nov 22, 2021 · 4 comments
@adrianosantospb commented Nov 22, 2021

Hi,

I'm trying to use your implementation to convert an Ultralytics YOLOV5 model to TRT. I'm following the steps you wrote and I'm getting this error:

docker run --runtime nvidia -v ~/Documents/testmodelo/:/models/ --rm alxmamaev/jetson_yolov5_trt:latest trtexec --onnx=/models/01042021yolov5A.onnx --saveEngine=/models/01042021yolov5A.plan --fp16
&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # trtexec --onnx=/models/01042021yolov5A.onnx --saveEngine=/models/01042021yolov5A.plan --fp16
[11/22/2021-19:20:35] [I] === Model Options ===
[11/22/2021-19:20:35] [I] Format: ONNX
[11/22/2021-19:20:35] [I] Model: /models/01042021yolov5A.onnx
[11/22/2021-19:20:35] [I] Output:
[11/22/2021-19:20:35] [I] === Build Options ===
[11/22/2021-19:20:35] [I] Max batch: explicit
[11/22/2021-19:20:35] [I] Workspace: 16 MiB
[11/22/2021-19:20:35] [I] minTiming: 1
[11/22/2021-19:20:35] [I] avgTiming: 8
[11/22/2021-19:20:35] [I] Precision: FP32+FP16
[11/22/2021-19:20:35] [I] Calibration:
[11/22/2021-19:20:35] [I] Refit: Disabled
[11/22/2021-19:20:35] [I] Sparsity: Disabled
[11/22/2021-19:20:35] [I] Safe mode: Disabled
[11/22/2021-19:20:35] [I] Restricted mode: Disabled
[11/22/2021-19:20:35] [I] Save engine: /models/01042021yolov5A.plan
[11/22/2021-19:20:35] [I] Load engine:
[11/22/2021-19:20:35] [I] NVTX verbosity: 0
[11/22/2021-19:20:35] [I] Tactic sources: Using default tactic sources
[11/22/2021-19:20:35] [I] timingCacheMode: local
[11/22/2021-19:20:35] [I] timingCacheFile:
[11/22/2021-19:20:35] [I] Input(s)s format: fp32:CHW
[11/22/2021-19:20:35] [I] Output(s)s format: fp32:CHW
[11/22/2021-19:20:35] [I] Input build shapes: model
[11/22/2021-19:20:35] [I] Input calibration shapes: model
[11/22/2021-19:20:35] [I] === System Options ===
[11/22/2021-19:20:35] [I] Device: 0
[11/22/2021-19:20:35] [I] DLACore:
[11/22/2021-19:20:35] [I] Plugins:
[11/22/2021-19:20:35] [I] === Inference Options ===
[11/22/2021-19:20:35] [I] Batch: Explicit
[11/22/2021-19:20:35] [I] Input inference shapes: model
[11/22/2021-19:20:35] [I] Iterations: 10
[11/22/2021-19:20:35] [I] Duration: 3s (+ 200ms warm up)
[11/22/2021-19:20:35] [I] Sleep time: 0ms
[11/22/2021-19:20:35] [I] Streams: 1
[11/22/2021-19:20:35] [I] ExposeDMA: Disabled
[11/22/2021-19:20:35] [I] Data transfers: Enabled
[11/22/2021-19:20:35] [I] Spin-wait: Disabled
[11/22/2021-19:20:35] [I] Multithreading: Disabled
[11/22/2021-19:20:35] [I] CUDA Graph: Disabled
[11/22/2021-19:20:35] [I] Separate profiling: Disabled
[11/22/2021-19:20:35] [I] Time Deserialize: Disabled
[11/22/2021-19:20:35] [I] Time Refit: Disabled
[11/22/2021-19:20:35] [I] Skip inference: Disabled
[11/22/2021-19:20:35] [I] Inputs:
[11/22/2021-19:20:35] [I] === Reporting Options ===
[11/22/2021-19:20:35] [I] Verbose: Disabled
[11/22/2021-19:20:35] [I] Averages: 10 inferences
[11/22/2021-19:20:35] [I] Percentile: 99
[11/22/2021-19:20:35] [I] Dump refittable layers: Disabled
[11/22/2021-19:20:35] [I] Dump output: Disabled
[11/22/2021-19:20:35] [I] Profile: Disabled
[11/22/2021-19:20:35] [I] Export timing to JSON file:
[11/22/2021-19:20:35] [I] Export output to JSON file:
[11/22/2021-19:20:35] [I] Export profile to JSON file:
[11/22/2021-19:20:35] [I]
[11/22/2021-19:20:35] [I] === Device Information ===
[11/22/2021-19:20:35] [I] Selected Device: Xavier
[11/22/2021-19:20:35] [I] Compute Capability: 7.2
[11/22/2021-19:20:35] [I] SMs: 6
[11/22/2021-19:20:35] [I] Compute Clock Rate: 1.109 GHz
[11/22/2021-19:20:35] [I] Device Global Memory: 7773 MiB
[11/22/2021-19:20:35] [I] Shared Memory per SM: 96 KiB
[11/22/2021-19:20:35] [I] Memory Bus Width: 256 bits (ECC disabled)
[11/22/2021-19:20:35] [I] Memory Clock Rate: 1.109 GHz
[11/22/2021-19:20:35] [I]
[11/22/2021-19:20:35] [I] TensorRT version: 8001
[11/22/2021-19:20:37] [I] [TRT] [MemUsageChange] Init CUDA: CPU +353, GPU +0, now: CPU 371, GPU 6306 (MiB)
[11/22/2021-19:20:37] [I] Start parsing network model
[11/22/2021-19:20:38] [I] [TRT] ----------------------------------------------------------------
[11/22/2021-19:20:38] [I] [TRT] Input filename: /models/01042021yolov5A.onnx
[11/22/2021-19:20:38] [I] [TRT] ONNX IR version: 0.0.7
[11/22/2021-19:20:38] [I] [TRT] Opset version: 13
[11/22/2021-19:20:38] [I] [TRT] Producer name: pytorch
[11/22/2021-19:20:38] [I] [TRT] Producer version: 1.10
[11/22/2021-19:20:38] [I] [TRT] Domain:
[11/22/2021-19:20:38] [I] [TRT] Model version: 0
[11/22/2021-19:20:38] [I] [TRT] Doc string:
[11/22/2021-19:20:38] [I] [TRT] ----------------------------------------------------------------
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:39] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:40] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:41] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:41] [W] [TRT] onnx2trt_utils.cpp:390: One or more weights outside the range of INT32 was clamped
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [W] [TRT] Output type must be INT32 for shape outputs
[11/22/2021-19:20:41] [I] Finish parsing network model
[11/22/2021-19:20:41] [W] Dynamic dimensions required for input: images, but no shapes were provided. Automatically overriding shape to: 1x3x1x1
[11/22/2021-19:20:41] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 708, GPU 6947 (MiB)
[11/22/2021-19:20:41] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 708 MiB, GPU 6946 MiB
[11/22/2021-19:20:42] [E] Error[2]: [graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 2: Internal Error (Concat_40: dimensions not compatible for concatenation
Concat_40: dimensions not compatible for concatenation
Concat_40: dimensions not compatible for concatenation
Concat_40: dimensions not compatible for concatenation
)
[11/22/2021-19:20:42] [E] Error[2]: [builder.cpp::buildSerializedNetwork::417] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed.)

Do you have any information that could help me with this issue?

Thanks.

alxmamaev (Owner) commented Nov 29, 2021

@adrianosantospb Did you use the official exporter from ultralytics?

Poulinakis-Konstantinos commented Nov 29, 2021

Hello @alxmamaev, I'm encountering the same issue. I used the ultralytics exporter to convert the .pt model into ONNX, then tried the command above to convert ONNX->TRT and got the exact same error.
The model I used was the demo yolov5s model offered by ultralytics.
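For reference, the export was along these lines (a reconstruction, assuming the standard `export.py` interface in the yolov5 repo at the time; exact flags vary by version):

```shell
# Hypothetical reconstruction of the export step (flags differ across yolov5 versions):
python export.py --weights yolov5s.pt --include onnx --opset 12
```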

Any help would be greatly appreciated. Thanks

@alxmamaev (Owner)

@Poulinakis-Konstantinos I think it may be happening with a recent version of pytorch/yolov5. Can you check your model with the yolov5 version from Aug 2021?

alxmamaev (Owner) commented Dec 2, 2021

@adrianosantospb @Poulinakis-Konstantinos I am not the creator of the converter; I just use the trtexec utility from this repo: https://github.com/NVIDIA/TensorRT

I think there may be a few reasons for the problem.

  1. Some box post-processing operations (like NMS or something similar) are included in the ONNX file, and trtexec cannot convert them.
  2. In the new version of YOLOv5 the architecture may have changed slightly, and an operation it uses (based on your log, Concat_40) may not be supported by this version of TensorRT.
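For context on reason 2: TensorRT, like most tensor libraries, only allows concatenation when every dimension except the concat axis matches across all inputs. A minimal illustrative sketch of that constraint (not TensorRT's actual code):

```python
def concat_compatible(shapes, axis):
    """Return True if tensors with these shapes can be concatenated along `axis`.

    All dimensions except `axis` must match across every input; a mismatch is
    the kind of condition behind errors like
    "Concat_40: dimensions not compatible for concatenation".
    """
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            return False  # rank mismatch
        for dim, (a, b) in enumerate(zip(first, shape)):
            if dim != axis and a != b:
                return False  # a non-concat dimension differs
    return True

# Concatenating along axis 1 works when only axis 1 differs:
print(concat_compatible([(1, 128, 20, 20), (1, 256, 20, 20)], axis=1))  # True
# A collapsed spatial dim (e.g. the 1x3x1x1 fallback shape) breaks it:
print(concat_compatible([(1, 128, 1, 1), (1, 128, 20, 20)], axis=1))    # False
```

This is why the "Automatically overriding shape to: 1x3x1x1" warning earlier in the log is suspicious: once the input shape collapses, downstream feature maps no longer line up for concatenation.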

You can first check the model's ONNX graph using https://netron.app. If it looks okay, you can try downgrading yolov5 to an older commit, or try building a newer version of trtexec; to do that, change the TensorRT version here:

RUN cd TensorRT && git checkout release/7.1 && git submodule update --init --recursive

Compiling takes about 1 hour on a Jetson Nano.
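One more thing that may be worth trying: the log warns "Dynamic dimensions required for input: images ... Automatically overriding shape to: 1x3x1x1", and that fallback shape is exactly what Concat_40 then chokes on. trtexec accepts a `--shapes` flag for explicit input shapes; something like this might get past it (untested here; 640x640 is an assumption based on the usual yolov5 training size, and the input name `images` comes from your log):

```shell
docker run --runtime nvidia -v ~/Documents/testmodelo/:/models/ --rm \
    alxmamaev/jetson_yolov5_trt:latest \
    trtexec --onnx=/models/01042021yolov5A.onnx \
            --saveEngine=/models/01042021yolov5A.plan \
            --fp16 \
            --shapes=images:1x3x640x640  # explicit shape instead of the 1x3x1x1 fallback
```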

If you have any trouble with the build, you are welcome to ask.
