
Custom ONNX model load problem #800

Closed
hokwangchoi opened this issue Nov 17, 2020 · 18 comments
hokwangchoi commented Nov 17, 2020

Hello dusty! I am very much enjoying your tutorials and am grateful for them.

I've looked through the issues but couldn't find a relevant one.

I am currently trying to load an .onnx model (which is not from one of your tutorials) with detectNet in my custom C++ code (jetson-inference is correctly included).
Using the C++ detectNet class, I create a net: net = detectNet::Create(NULL, "path/to/model.onnx", 0.0f, "path/to/labels.txt");.
I intend to use the following Create function from the source code:

	/**
	 * Load a custom network instance
	 * @param prototxt_path File path to the deployable network prototxt
	 * @param model_path File path to the caffemodel
	 * @param mean_pixel Input transform subtraction value (use 0.0 if the network already does this)
	 * @param class_labels File path to list of class name labels
	 * @param threshold default minimum threshold for detection
	 * @param input Name of the input layer blob.
	 * @param coverage Name of the output coverage classifier layer blob, which contains the confidence values for each bbox.
	 * @param bboxes Name of the output bounding box layer blob, which contains a grid of rectangles in the image.
	 * @param maxBatchSize The maximum batch size that the network will support and be optimized for.
	 */
	static detectNet* Create( const char* prototxt_path, const char* model_path, float mean_pixel=0.0f, 
						 const char* class_labels=NULL, float threshold=DETECTNET_DEFAULT_THRESHOLD, 
						 const char* input = DETECTNET_DEFAULT_INPUT, 
						 const char* coverage = DETECTNET_DEFAULT_COVERAGE, 
						 const char* bboxes = DETECTNET_DEFAULT_BBOX,
						 uint32_t maxBatchSize=DEFAULT_MAX_BATCH_SIZE, 
						 precisionType precision=TYPE_FASTEST,
				   		 deviceType device=DEVICE_GPU, bool allowGPUFallback=true );

The error occurs when the internal code tries to look up the input blob:

[TRT]    INVALID_ARGUMENT: Cannot find binding of given name: data
[TRT]    failed to find requested input layer data in network
[TRT]    device GPU, failed to create resources for CUDA engine

I noticed that the default input_blob argument ("data") doesn't work for an external .onnx model.

Should I provide the correct input_blob argument to load the model? And how can I find this information?
Most conversion examples (any model to .onnx) don't provide it, so I need some help on this with jetson-inference.

Looking forward to your ideas! Thanks :)


murthax commented Nov 18, 2020

I'm having the same problem with a model I converted from PyTorch YOLOv5. I get:

[TRT] INVALID_ARGUMENT: Cannot find binding of given name: data
[TRT] failed to find requested input layer data in network
[TRT] device GPU, failed to create resources for CUDA engine
[TRT] failed to create TensorRT engine for models/test/best.onnx, device GPU
[TRT] detectNet -- failed to initialize.
detectnet: failed to load detectNet model

I tried a few options I had from other models, like --input-blob=input_0, but without knowing where to look this up I wasn't sure.

I am using the aarch64 Jetson Xavier NX.


hokwangchoi commented Nov 18, 2020

@murthax I am trying the same model, PyTorch YOLOv5. My platform is also aarch64, a Jetson Xavier AGX.

And I have found the input and output blob names.
Could you try --input-blob=images? Your model can be visualized with the Netron tool.

Mine looks like this:
[screenshot from 2020-11-18: Netron view of the exported ONNX graph showing the input blob "images"]

However, when you export the model to ONNX with export.py (from the yolov5 source code), you will notice that there is only one output name, 'output', since the dry run works from there.

Then I assign the output names as output_names=['classes', 'boxes']. This way jetson-inference's detectNet can load the model, but I have further issues.

My current issue is that the output bindings don't seem to be correct, though nothing complains at the loading stage. The log looks like the following:

[TRT]    binding to input 0 images  binding index:  0
[TRT]    binding to input 0 images  dims (b=1 c=3 h=736 w=544) size=4804608
[TRT]    binding to output 0 classes  binding index:  1
[TRT]    binding to output 0 classes  dims (b=1 c=3 h=92 w=68) size=75072
[TRT]    binding to output 1 boxes  binding index:  2
[TRT]    binding to output 1 boxes  dims (b=1 c=3 h=46 w=34) size=18768

(I also tweaked the input image size when exporting the model.)
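As a sanity check, the logged buffer sizes are consistent with dense FP32 tensors of the printed dimensions, i.e. size = b * c * h * w * 4 bytes. A quick sketch (pure Python, with the numbers taken from the log above):

```python
# Verify that the logged TensorRT binding sizes match dense FP32 tensors
# of the printed (b, c, h, w) dims: size = b * c * h * w * 4 bytes.

def binding_size(b, c, h, w, bytes_per_elem=4):
    """Byte size of a dense FP32 binding with the given dimensions."""
    return b * c * h * w * bytes_per_elem

# dims taken from the log above
print(binding_size(1, 3, 736, 544))  # input  'images'  -> 4804608
print(binding_size(1, 3, 92, 68))    # output 'classes' -> 75072
print(binding_size(1, 3, 46, 34))    # output 'boxes'   -> 18768
```

If detectNet's post-processing assumes a different output shape than the model actually produces, reads past the end of these buffers are a plausible cause of an illegal-memory-access error at inference time.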

Then, when I try to run detection with this model, I get an illegal memory access error from CUDA:

detection start
[TRT]    engine.cpp (986) - Cuda Error in executeInternal: 700 (an illegal memory access was encountered)
[TRT]    FAILED_EXECUTION: std::exception
[TRT]    failed to execute TensorRT context on device GPU
detection end

The first and last prints are my debugging messages.


berkantay commented Nov 18, 2020

I am having the same issue on a Jetson TX2 with YOLOv5; I get exactly the same error. Any update would be helpful.

When I change to --input-blob=images, nothing changes in the error message.


hokwangchoi commented Nov 18, 2020

@berkantay Could you check your model with Netron?

It seems like the detection stage is not included in the ONNX model conversion. This might help.

I am encountering two more issues now.

1. Loading the ONNX model reports a plugin error when I include the detection stage in the model (by setting

https://github.com/ultralytics/yolov5/blob/a1c8406af3eac3e20d4dd5d327fd6cbd4fbb9752/models/export.py#L28

to False):

[TRT]    builtin_op_importers.cpp:3659: Searching for plugin: ScatterND, plugin_version: 1, plugin_namespace: 
[TRT]    INVALID_ARGUMENT: getPluginCreator could not find plugin ScatterND version 1
ERROR: builtin_op_importers.cpp:3661 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
2. How can we make the output correspond to what jetson-inference's detectNet expects (including NMS)?


berkantay commented Nov 18, 2020


Hello @choi0330, when I check the model with Netron, the input layer is called "images". And this is the output of the converted model: https://ibb.co/TTSQw5r


murthax commented Nov 18, 2020

I'm at a similar stage. I did use Netron to interpret the ONNX file and I see some outputs. I was able to move past the input-blob issue by using "images", but then on to the next ones (--input-blob=images --output-cvg=? --output-bbox=?).

[screenshot: Netron view showing the model's output layers]

If I use the outputs shown there, i.e. --input-blob=images --output-cvg=791 --output-bbox=771, the engine starts, but I end up with memory errors:

[TRT] engine.cpp (986) - Cuda Error in executeInternal: 700 (an illegal memory access was encountered)
[TRT] FAILED_EXECUTION: std::exception
[TRT] failed to execute TensorRT context on device GPU
[OpenGL] glDisplay -- set the window size to 1280x720
[OpenGL] creating 1280x720 texture (GL_RGB8 format, 2764800 bytes)
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/utils/display/glTexture.cpp:360
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/build/aarch64/include/jetson-utils/cudaMappedMemory.h:51
RingBuffer -- failed to allocate zero-copy buffer of 1382400 bytes
[gstreamer] gstEncoder -- failed to allocate buffers (1382400 bytes each)
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/build/aarch64/include/jetson-inference/tensorNet.h:685

@dusty-nv (Owner)

To support different object detection models in jetson-inference, you would need to add/modify the pre/post-processing code found here:

This should be made to match the pre/post-processing that gets performed on the original model.
It also seems like you might need to add support for a 3rd output layer - the previous detection models in jetson-inference used 2 output layers.

Since you are using PyTorch, you might also want to try the torch2trt project - https://github.com/nvidia-ai-iot/torch2trt
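As a hedged illustration of what "matching the pre-processing" involves for YOLOv5: the original repo letterbox-resizes the image to the network input size (scale to fit, then pad), scales pixels to [0, 1], and keeps RGB channel order. A minimal sketch of the resize geometry only (pure Python; the function name and the 640 network size are illustrative assumptions, not jetson-inference API):

```python
def letterbox_dims(src_w, src_h, dst=640):
    """Compute the resized content size and per-side padding used by
    YOLOv5-style letterboxing: scale to fit inside dst x dst, pad the rest."""
    scale = min(dst / src_w, dst / src_h)       # preserve aspect ratio
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_w, pad_h = dst - new_w, dst - new_h     # total padding
    # padding is split evenly between the two sides of each axis
    return new_w, new_h, pad_w / 2, pad_h / 2

print(letterbox_dims(1280, 720))  # -> (640, 360, 0.0, 140.0)
```

If the jetson-inference pre-processing instead stretches the image to the network size without padding, the box coordinates coming out of the network will not line up with the original image, which is one way the pipelines can silently disagree.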


hokwangchoi commented Nov 19, 2020

Thanks for the answer.

I would love to utilize the jetson-inference detectNet model, so I'll try to match the pre/post-processing of the original model for now.

@berkantay

If you find a solution it would be nice to share it with us, @choi0330. I will also update this issue if I have good news.


berkantay commented Nov 20, 2020

Hello everyone, when I change the command to
detectnet --model=/home/user/Downloads/tyes.onnx --labels=/home/user/Downloads/a.txt --input-blob=input_0 --output-cvg=771 --output-bbox=791

I see no memory warnings; I think the network works. But I get errors from gstreamer about an unsupported image format:

detectnet:  failed to capture video frame
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/cuda/cudaYUV-NV12.cu:154
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/cuda/cudaColorspace.cpp:42
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/codec/gstDecoder.cpp:895
[gstreamer] gstDecoder::Capture() -- unsupported image format (rgb8)
[gstreamer]                          supported formats are:
[gstreamer]                              * rgb8
[gstreamer]                              * rgba8
[gstreamer]                              * rgb32f
[gstreamer]                    

Note: I get this error message on every frame, sequentially.

@hokwangchoi (Author)

Here I found some hint on how to export the model with the detection layer: ultralytics/yolov5#708 (comment)

However, default TensorRT doesn't support the ScatterND operation, which is used in the detection layer. So one possible option would be to implement the post-processing from the three outputs (https://github.com/ultralytics/yolov5/blob/master/models/yolo.py#L52-L59) plus NMS.

@murthax This is my current update.
@berkantay Please make sure your output layers --output-cvg=771 --output-bbox=791 really are the confidences and bounding boxes.
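For reference, a minimal sketch of that post-processing in pure Python: decode each cell/anchor prediction with the YOLOv5 head formulas (xy = (2*sigmoid(t) - 0.5 + grid) * stride, wh = (2*sigmoid(t))^2 * anchor), then apply greedy NMS. The helper names and example values below are illustrative assumptions, not jetson-inference code:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, gx, gy, stride, anchor_w, anchor_h):
    """YOLOv5 head decoding for one grid cell and anchor (cf. yolo.py L52-L59)."""
    cx = (2 * sigmoid(tx) - 0.5 + gx) * stride
    cy = (2 * sigmoid(ty) - 0.5 + gy) * stride
    w = (2 * sigmoid(tw)) ** 2 * anchor_w
    h = (2 * sigmoid(th)) ** 2 * anchor_h
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2  # x1, y1, x2, y2

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(boxes, scores, iou_thres=0.45):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thres]
    return keep
```

With the decoded boxes from all three output scales concatenated, nms() keeps only the highest-scoring box of each overlapping cluster, which is the part the ScatterND-based detection layer would otherwise handle inside the model.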


RGring commented Jan 20, 2021

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!


murthax commented Jan 20, 2021

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

I wish I had better news but I was not able to move forward here. I've been using Yolov5 directly on my Jetson Xavier NX. The FPS isn't as good but at least I'm able to move forward with my project.

@berkantay

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

Unfortunately I stopped my work with jetson-inference and implemented yolov5 itself, without the jetson-inference library, on a Jetson TX2.


RGring commented Jan 22, 2021

Ok thanks for your responses!

@hokwangchoi (Author)

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

I went a bit further and tried to implement the detection layers in the jetson-inference library, but it didn't work out. I guess it will be best to wait for a ScatterND plugin in TensorRT before running yolov5 with jetson-inference.
In the end, I also run YOLOv5 directly on the Jetson for now.

@dusty-nv (Owner)

It looks like there are some TensorRT YOLOv5 projects out there:

https://github.com/SeanAvery/yolov5-tensorrt
https://www.google.com/search?q=tensorRT+yolov5


MichaelWU0726 commented Aug 17, 2021


The ScatterND operator has been supported since TensorRT version 8. Could you migrate the implementation of the scatterPlugin to version 7? I have to use version 7.
