
Exporting TensorRT (engine) with dynamic batch size failing #9688

Closed
adityasihag1804 opened this issue Oct 4, 2022 · 1 comment · Fixed by #9691
Labels
bug Something isn't working

Comments

@adityasihag1804

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Export

Bug

I'm trying to export a pretrained YOLOv5 model to engine format with a dynamic shape using this command:

python export.py --weights yolov5m_coco.pt --include engine --device 0 --batch-size 256 --dynamic

But it's failing for some reason. Here's the output log.

export: data=data/coco128.yaml, weights=['yolov5m_coco.pt'], imgsz=[640, 640], batch_size=256, device=0, half=False, inplace=False, keras=False, optimize=False, int8=False, dynamic=True, simplify=False, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['engine']
fatal: detected dubious ownership in repository at '/media/dev/aditya/yolov5'
To add an exception for this directory, call:

git config --global --add safe.directory /media/dev/aditya/yolov5

YOLOv5 🚀 2022-9-29 Python-3.7.13 torch-1.12.1+cu102 CUDA:0 (Quadro RTX 8000, 48601MiB)

Fusing layers...
YOLOv5m summary: 290 layers, 21172173 parameters, 0 gradients

PyTorch: starting from yolov5m_coco.pt with output shape (256, 25200, 85) (40.8 MB)

False starting export with onnx 1.12.0...
[W shape_type_inference.cpp:425] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
[W shape_type_inference.cpp:425] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
[W shape_type_inference.cpp:425] Warning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select) (function ComputeConstantFolding)
ONNX: export failure ❌ 4.1s: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)

TensorRT: starting export with TensorRT 8.4.3.1...
[10/04/2022-15:08:23] [TRT] [I] [MemUsageChange] Init CUDA: CPU +285, GPU +0, now: CPU 2321, GPU 20679 (MiB)
[10/04/2022-15:08:24] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +206, GPU +70, now: CPU 2544, GPU 20749 (MiB)
export.py:270: DeprecationWarning: Use set_memory_pool_limit instead.
config.max_workspace_size = workspace * 1 << 30
[10/04/2022-15:08:24] [TRT] [I] ----------------------------------------------------------------
[10/04/2022-15:08:24] [TRT] [I] Input filename: yolov5m_coco.onnx
[10/04/2022-15:08:24] [TRT] [I] ONNX IR version: 0.0.7
[10/04/2022-15:08:24] [TRT] [I] Opset version: 12
[10/04/2022-15:08:24] [TRT] [I] Producer name: pytorch
[10/04/2022-15:08:24] [TRT] [I] Producer version: 1.12.1
[10/04/2022-15:08:24] [TRT] [I] Domain:
[10/04/2022-15:08:24] [TRT] [I] Model version: 0
[10/04/2022-15:08:24] [TRT] [I] Doc string:
[10/04/2022-15:08:24] [TRT] [I] ----------------------------------------------------------------
[10/04/2022-15:08:24] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
TensorRT: input "images" with shape(400, 3, 640, 640) DataType.FLOAT
TensorRT: output "output0" with shape(400, 25200, 85) DataType.FLOAT
TensorRT: building FP32 engine as yolov5m_coco.engine
export.py:297: DeprecationWarning: Use build_serialized_network instead.
with builder.build_engine(network, config) as engine, open(f, 'wb') as t:
[10/04/2022-15:08:24] [TRT] [E] 4: [network.cpp::operator()::3020] Error Code 4: Internal Error (images: kMIN dimensions in profile 0 are [1,3,640,640] but input has static dimensions [400,3,640,640].)
TensorRT: export failure ❌ 5.3s: enter
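For reference, the two failures in this log are separable: the ONNX step hits a cuda:0 vs cpu device mismatch during constant folding in symbolic shape inference, and the TensorRT step then fails because the parsed network still has a static input shape ([400, 3, 640, 640]) that the optimization profile's kMIN dimensions [1, 3, 640, 640] cannot cover. Below is a minimal sketch of the general technique (not the export.py implementation and not the fix in #9691), assuming the TensorRT 8.x Python API and that the checkpoint loads via torch.hub: export the ONNX graph on CPU with a dynamic batch axis, then give the builder an optimization profile that spans the batch sizes you intend to serve.

```python
# Sketch only: file names, torch.hub loading, and the opt/max batch sizes are assumptions.
import torch
import tensorrt as trt

# 1) Export ONNX on CPU with a dynamic batch axis, avoiding the cuda:0 vs cpu mismatch above.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='yolov5m_coco.pt',
                       autoshape=False)  # raw model, no pre/post-processing wrapper
model = model.cpu().eval()
dummy = torch.zeros(1, 3, 640, 640)  # batch dim is symbolic, so a batch of 1 is enough
torch.onnx.export(
    model, dummy, 'yolov5m_coco.onnx',
    opset_version=12,
    input_names=['images'], output_names=['output0'],
    dynamic_axes={'images': {0: 'batch'}, 'output0': {0: 'batch'}},
)

# 2) Build a TensorRT engine whose optimization profile covers the target batch sizes,
#    so kMIN/kOPT/kMAX are consistent with the dynamic input rather than a fixed shape.
logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open('yolov5m_coco.onnx', 'rb') as f:
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB workspace

profile = builder.create_optimization_profile()
profile.set_shape('images',
                  min=(1, 3, 640, 640),
                  opt=(128, 3, 640, 640),   # assumed "typical" batch; tune for your workload
                  max=(256, 3, 640, 640))   # matches --batch-size 256 from the command above
config.add_optimization_profile(profile)

engine = builder.build_serialized_network(network, config)  # replaces deprecated build_engine
with open('yolov5m_coco.engine', 'wb') as f:
    f.write(bytearray(engine))
```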

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
adityasihag1804 added the bug label Oct 4, 2022
@glenn-jocher
Member

glenn-jocher commented Oct 4, 2022

@adityasihag1804 good news 😃! Your original issue may now be fixed ✅ in PR #9691.

To receive this update:

  • Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – View the updated notebooks on Gradient, Colab, or Kaggle
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
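Once the update is pulled, re-running the original command from this report (python export.py --weights yolov5m_coco.pt --include engine --device 0 --batch-size 256 --dynamic) is a quick check: the ONNX stage should complete without the device-mismatch error, and the TensorRT stage should build the engine with a dynamic batch profile.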
