ONNX export with GPU (--device opt) is not working #4159

Closed
SamSamhuns opened this issue Jul 26, 2021 · 9 comments · Fixed by #5110
Labels: bug (Something isn't working), Stale


@SamSamhuns
Contributor

🐛 Bug

I get an "Expected all tensors to be on the same device" error when running $ python export.py --weights yolov5m.pt --include onnx --device 0

To Reproduce (REQUIRED)

Input:

# download yolov5m.pt model first
$ python export.py --weights yolov5m.pt --include onnx --device 0

Output:

Traceback (most recent call last):
  File "export.py", line 54, in export_onnx
    torch.onnx.export(model, img, f, verbose=False, opset_version=11,
  File "/home/sam/human_body_proportion_estimation/yolov5/venv/lib/python3.8/site-packages/torch/onnx/__init__.py", line 275, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/home/sam/human_body_proportion_estimation/yolov5/venv/lib/python3.8/site-packages/torch/onnx/utils.py", line 88, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/home/sam/human_body_proportion_estimation/yolov5/venv/lib/python3.8/site-packages/torch/onnx/utils.py", line 689, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/home/sam/human_body_proportion_estimation/yolov5/venv/lib/python3.8/site-packages/torch/onnx/utils.py", line 501, in _model_to_graph
    params_dict = torch._C._jit_pass_onnx_constant_fold(graph, params_dict,
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument index in method wrapper_index_select)

Expected behavior

Successful onnx export

Environment


  • OS: Ubuntu
  • GPU: Tesla V100

Additional context

I also checked that the model and the img passed to the ONNX export were on the same CUDA device, and they were.

@SamSamhuns added the bug label on Jul 26, 2021
@glenn-jocher
Member

glenn-jocher commented Jul 26, 2021

@SamSamhuns yes this has been reported before. ONNX models must be exported on CPU device for now. If you determine the cause of the issue please submit a PR to help other users, thank you!
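
For reference, the equivalent CPU-device export of the same weights would be something like this (a minimal example, not taken from this thread):

$ python export.py --weights yolov5m.pt --include onnx --device cpu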

@SamSamhuns
Contributor Author

I found a way to avoid the error when exporting with GPU, but I am not sure whether it is worth a PR @glenn-jocher.

It seems the error occurs when the do_constant_folding parameter is enabled in the ONNX export call, which triggers the failure at line 501 of torch/onnx/utils.py: params_dict = torch._C._jit_pass_onnx_constant_fold(graph, params_dict, _export_onnx_opset_version).

Unfortunately, even after verifying that each model parameter was on the correct CUDA device, the error persisted.

However, GPU export is possible with the following change, although disabling constant folding might cause computational penalties:

torch.onnx.export(model, img, f, verbose=False, opset_version=opset,
                  training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
                  do_constant_folding=(not train) and (not next(model.parameters()).is_cuda),  # Additional check if cuda used
                  input_names=['images'],
                  output_names=['output'],
                  dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # shape(1,3,640,640)
                                'output': {0: 'batch', 1: 'anchors'}  # shape(1,25200,85)
                                } if dynamic else None)

Note: Setting model.model[-1].export = True, as suggested in some other issues, did not solve the problem either.

@glenn-jocher
Member

@SamSamhuns can you quantify the penalty, i.e. extra layers or a difference in parameters between the two export methods, or share profiling results when running python detect.py --weights yolov5s.onnx?
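
One way to quantify the structural difference is to compare the two exported graphs directly with the onnx Python package. A minimal sketch, assuming the two exports were saved as yolov5s_cpu.onnx and yolov5s_gpu.onnx (hypothetical file names, not produced by export.py):

# Compare two exported ONNX graphs: per-op-type layer counts and total stored parameters.
# Sketch only; the file names below are hypothetical placeholders.
from collections import Counter

import onnx
from onnx import numpy_helper

def summarize(path):
    model = onnx.load(path)
    ops = Counter(node.op_type for node in model.graph.node)  # layers per op type
    params = sum(numpy_helper.to_array(t).size for t in model.graph.initializer)  # stored parameter count
    return ops, params

ops_cpu, params_cpu = summarize('yolov5s_cpu.onnx')  # exported with --device cpu
ops_gpu, params_gpu = summarize('yolov5s_gpu.onnx')  # exported with --device 0 (constant folding disabled)
print('parameters:', params_cpu, 'vs', params_gpu)
print('op-type count differences:', (ops_cpu - ops_gpu) + (ops_gpu - ops_cpu))  # per-op-type differences between the two graphs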

@SamSamhuns
Contributor Author

It seems that in this case the ONNX models, whether exported on CPU or GPU, have the same accuracy and speed on a cursory glance.

So the additional (not next(model.parameters()).is_cuda) check can be added as a temporary fix to avoid the error, but it should not be a long-term solution.

[Comparison images: detections on bus.jpg from the CPU-exported and GPU-exported ONNX models look identical.]

However, when using the --half option with the GPU ONNX export, the export completes but the half-precision model unfortunately fails at inference:

Traceback (most recent call last):
  File "detect.py", line 243, in <module>
    main(opt)
  File "detect.py", line 238, in main
    run(**vars(opt))
  File "/home/sam/yolov5/venv/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "detect.py", line 82, in run
    session = onnxruntime.InferenceSession(w, None)
  File "/home/sam/yolov5/venv/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/sam/yolov5/venv/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from yolov5s_gpu_half.onnx failed:Type Error: Type parameter (T) of Optype (Concat) bound to different types (tensor(float) and tensor(float16) in node (Concat_540).

For some reason, one node is still float32 despite the model being exported in float16, so there are still some issues with GPU ONNX export.
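
To narrow down where the float32 tensor leaks into the FP16 graph, one option is to run ONNX shape inference and list every tensor that is not float16. A diagnostic sketch only, assuming the file name from the traceback above:

# List tensors in an FP16 export that are still float32 (diagnostic sketch).
import onnx
from onnx import TensorProto, shape_inference

model = shape_inference.infer_shapes(onnx.load('yolov5s_gpu_half.onnx'))

for init in model.graph.initializer:  # stored weights/constants that were not converted to float16
    if init.data_type == TensorProto.FLOAT:
        print('float32 initializer:', init.name)

for vi in list(model.graph.value_info) + list(model.graph.output):  # inferred intermediate and output types
    if vi.type.tensor_type.elem_type == TensorProto.FLOAT:
        print('float32 tensor:', vi.name)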

@glenn-jocher
Member

@SamSamhuns there's really no reason to export on GPU other than to produce an FP16 model. FP16 models don't run in PyTorch on CPU, as the PyTorch CPU backend instruction sets are not capable of handling them; I don't know about ONNX.

@SamSamhuns
Contributor Author

Makes sense. In any case, there is some underlying issue in ONNX or the PyTorch ONNX export that is causing this.

@github-actions
Contributor

github-actions bot commented Aug 27, 2021

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@LaserLV52


Hello, I followed your solution by changing do_constant_folding=not train to do_constant_folding=(not train) and (not next(model.parameters()).is_cuda). Then I used --device 0 to export the ONNX model, which did not raise the error. But when I used the ONNX file in detect.py, I found that the model still runs on CPU rather than GPU. Do you have any idea why? Thanks!
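
One possible explanation (an assumption, not confirmed in this thread): the detect.py call shown in the traceback above creates the session as onnxruntime.InferenceSession(w, None), which runs on CPU unless the onnxruntime-gpu package is installed and the CUDA provider is requested explicitly, for example:

# Run an exported ONNX model on GPU with ONNX Runtime (requires the onnxruntime-gpu package).
# Sketch only; 'yolov5s.onnx' is a placeholder path.
import onnxruntime as ort

session = ort.InferenceSession('yolov5s.onnx',
                               providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])  # prefer CUDA, fall back to CPU
print(session.get_providers())  # shows which providers are actually active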

@glenn-jocher
Member

@SamSamhuns @LaserLV52 good news 😃! Your original issue may now be fixed ✅ in PR #5110 by @SamFC10. This PR implements backend-device improvements that allow YOLOv5 models to be exported to ONNX on either GPU or CPU, and to be exported at FP16 with the --half flag on GPU (--device 0).

To receive this update:

  • Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – view the updated notebooks on Colab or Kaggle
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
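
For reference, with the PR applied, the GPU FP16 export described above would be invoked as follows (combining the flags mentioned in this thread):

$ python export.py --weights yolov5s.pt --include onnx --device 0 --half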
