
Custom ONNX model load problem #800

Closed
hokwangchoi opened this issue Nov 17, 2020 · 18 comments
hokwangchoi commented Nov 17, 2020

Hello dusty! I am very much enjoying your tutorials and am grateful for them.

I've looked through the issues but couldn't find a relevant one.

I am currently trying to load an .onnx model (which is not from one of your tutorials) with detectNet in my custom C++ code (jetson-inference is correctly included).
Using the C++ detectNet class, I create a net: net = detectNet::Create(NULL, "path/to/model.onnx", 0.0f, "path/to/labels.txt");.
I intend to use the following Create function from the source code:

	/**
	 * Load a custom network instance
	 * @param prototxt_path File path to the deployable network prototxt
	 * @param model_path File path to the caffemodel
	 * @param mean_pixel Input transform subtraction value (use 0.0 if the network already does this)
	 * @param class_labels File path to list of class name labels
	 * @param threshold default minimum threshold for detection
	 * @param input Name of the input layer blob.
	 * @param coverage Name of the output coverage classifier layer blob, which contains the confidence values for each bbox.
	 * @param bboxes Name of the output bounding box layer blob, which contains a grid of rectangles in the image.
	 * @param maxBatchSize The maximum batch size that the network will support and be optimized for.
	 */
	static detectNet* Create( const char* prototxt_path, const char* model_path, float mean_pixel=0.0f, 
						 const char* class_labels=NULL, float threshold=DETECTNET_DEFAULT_THRESHOLD, 
						 const char* input = DETECTNET_DEFAULT_INPUT, 
						 const char* coverage = DETECTNET_DEFAULT_COVERAGE, 
						 const char* bboxes = DETECTNET_DEFAULT_BBOX,
						 uint32_t maxBatchSize=DEFAULT_MAX_BATCH_SIZE, 
						 precisionType precision=TYPE_FASTEST,
				   		 deviceType device=DEVICE_GPU, bool allowGPUFallback=true );

The error occurs when the internal code tries to look up the input blob:

[TRT]    INVALID_ARGUMENT: Cannot find binding of given name: data
[TRT]    failed to find requested input layer data in network
[TRT]    device GPU, failed to create resources for CUDA engine

I noticed that the default input_blob argument ("data") doesn't work for an external .onnx model.

Should I provide the correct input_blob argument to load the model? And how can I find this information?
Most conversion examples (any model to .onnx) don't provide it, so I need some help on this with jetson-inference.

Looking forward to your ideas! Thanks :)


murthax commented Nov 18, 2020

I'm having the same problem with a model I converted from PyTorch YOLOv5. I get:

[TRT] INVALID_ARGUMENT: Cannot find binding of given name: data
[TRT] failed to find requested input layer data in network
[TRT] device GPU, failed to create resources for CUDA engine
[TRT] failed to create TensorRT engine for models/test/best.onnx, device GPU
[TRT] detectNet -- failed to initialize.
detectnet: failed to load detectNet model

I tried a few options I had from other models, like --input-blob=input_0, but without knowing where to look this up I wasn't sure.

I am using the aarch64 Jetson Xavier NX.


hokwangchoi commented Nov 18, 2020

@murthax I am trying the same model, PyTorch YOLOv5. My platform is also aarch64, a Jetson Xavier AGX.

And I have found the input and output blob names.
Could you try --input-blob=images? Your model can be visualized with the Netron tool.

Mine looks like this:
[screenshot from 2020-11-18: Netron view of the exported ONNX graph showing the input blob "images"]

However, when you export the model to ONNX with export.py (from the yolov5 source code), you will notice that there is only one output name, 'output', since the dry run works from there.

Then I assign the output names as output_names=['classes', 'boxes']. This way jetson-inference's detectNet can load the model, but I have further issues.

My current issue is that the output bindings don't seem to be correct, though nothing complains at the loading stage. The log looks like the following:

[TRT]    binding to input 0 images  binding index:  0
[TRT]    binding to input 0 images  dims (b=1 c=3 h=736 w=544) size=4804608
[TRT]    binding to output 0 classes  binding index:  1
[TRT]    binding to output 0 classes  dims (b=1 c=3 h=92 w=68) size=75072
[TRT]    binding to output 1 boxes  binding index:  2
[TRT]    binding to output 1 boxes  dims (b=1 c=3 h=46 w=34) size=18768

(I also tweaked the input image size when exporting the model.)
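As a sanity check, the logged buffer sizes are consistent with dense FP32 tensors of the printed dimensions, i.e. size = b * c * h * w * 4 bytes. A quick sketch (pure Python, with the numbers taken from the log above):

```python
# Verify that the logged TensorRT binding sizes match dense FP32 tensors
# of the printed (b, c, h, w) dims: size = b * c * h * w * 4 bytes.

def binding_size(b, c, h, w, bytes_per_elem=4):
    """Byte size of a dense FP32 binding with the given dimensions."""
    return b * c * h * w * bytes_per_elem

# dims taken from the log above
print(binding_size(1, 3, 736, 544))  # input  'images'  -> 4804608
print(binding_size(1, 3, 92, 68))    # output 'classes' -> 75072
print(binding_size(1, 3, 46, 34))    # output 'boxes'   -> 18768
```

If detectNet's post-processing assumes a different output shape than the model actually produces, reads past the end of these buffers are a plausible cause of an illegal-memory-access error at inference time.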

Then, when I try to run detection with this model, I get an illegal memory access error from CUDA:

detection start
[TRT]    engine.cpp (986) - Cuda Error in executeInternal: 700 (an illegal memory access was encountered)
[TRT]    FAILED_EXECUTION: std::exception
[TRT]    failed to execute TensorRT context on device GPU
detection end

The first and last prints are my debugging messages.


berkantay commented Nov 18, 2020

I am having the same issue on a Jetson TX2 with YOLOv5; I get exactly the same error. Any update would be helpful.

When I change to --input-blob=images, nothing changes in the error message.


hokwangchoi commented Nov 18, 2020

@berkantay Could you check your model with Netron?

It seems like the detection stage is not included in the ONNX model conversion. This might help.

I am encountering two more issues now.

1. Loading the ONNX model reports a plugin error when I include the detection stage in the model (by setting

https://github.com/ultralytics/yolov5/blob/a1c8406af3eac3e20d4dd5d327fd6cbd4fbb9752/models/export.py#L28

to False):

[TRT]    builtin_op_importers.cpp:3659: Searching for plugin: ScatterND, plugin_version: 1, plugin_namespace: 
[TRT]    INVALID_ARGUMENT: getPluginCreator could not find plugin ScatterND version 1
ERROR: builtin_op_importers.cpp:3661 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
2. How can we make the output correspond to what jetson-inference's detectNet expects (including NMS)?


berkantay commented Nov 18, 2020


Hello @choi0330, when I check the model with Netron, the input layer is called "images". And this is the output of the converted model: https://ibb.co/TTSQw5r


murthax commented Nov 18, 2020

I'm at a similar stage. I did use Netron to interpret the ONNX file and I see some outputs. I was able to move past the input-blob issue by using "images", but then on to the next ones (--input-blob=images --output-cvg=? --output-bbox=?).

[screenshot: Netron view showing the model's output layers]

If I use the outputs shown there, i.e. --input-blob=images --output-cvg=791 --output-bbox=771, the engine starts, but I end up with memory errors:

[TRT] engine.cpp (986) - Cuda Error in executeInternal: 700 (an illegal memory access was encountered)
[TRT] FAILED_EXECUTION: std::exception
[TRT] failed to execute TensorRT context on device GPU
[OpenGL] glDisplay -- set the window size to 1280x720
[OpenGL] creating 1280x720 texture (GL_RGB8 format, 2764800 bytes)
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/utils/display/glTexture.cpp:360
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/build/aarch64/include/jetson-utils/cudaMappedMemory.h:51
RingBuffer -- failed to allocate zero-copy buffer of 1382400 bytes
[gstreamer] gstEncoder -- failed to allocate buffers (1382400 bytes each)
[cuda] an illegal memory access was encountered (error 700) (hex 0x2BC)
[cuda] /home/magneto/jetson-inference/build/aarch64/include/jetson-inference/tensorNet.h:685

@dusty-nv (Owner)

To support different object detection models in jetson-inference, you would need to add/modify the pre/post-processing code found here:

This should be made to match the pre/post-processing that gets performed on the original model.
It also seems like you might need to add support for a 3rd output layer - the previous detection models in jetson-inference used 2 output layers.

Since you are using PyTorch, you might also want to try the torch2trt project - https://github.com/nvidia-ai-iot/torch2trt
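As a hedged illustration of what "matching the pre-processing" involves for YOLOv5: the original repo letterbox-resizes the image to the network input size (scale to fit, then pad), scales pixels to [0, 1], and keeps RGB channel order. A minimal sketch of the resize geometry only (pure Python; the function name and the 640 network size are illustrative assumptions, not jetson-inference API):

```python
def letterbox_dims(src_w, src_h, dst=640):
    """Compute the resized content size and per-side padding used by
    YOLOv5-style letterboxing: scale to fit inside dst x dst, pad the rest."""
    scale = min(dst / src_w, dst / src_h)       # preserve aspect ratio
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_w, pad_h = dst - new_w, dst - new_h     # total padding
    # padding is split evenly between the two sides of each axis
    return new_w, new_h, pad_w / 2, pad_h / 2

print(letterbox_dims(1280, 720))  # -> (640, 360, 0.0, 140.0)
```

If the jetson-inference pre-processing instead stretches the image to the network size without padding, the box coordinates coming out of the network will not line up with the original image, which is one way the pipelines can silently disagree.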


hokwangchoi commented Nov 19, 2020

Thanks for the answer.

I would love to utilize the jetson-inference detectNet model, so I'll try to match the pre/post-processing of the original model for now.

@berkantay

If you find a solution it would be nice to share it with us, @choi0330. I will also update this issue if I have good news.


berkantay commented Nov 20, 2020

Hello everyone, when I change the command to
detectnet --model=/home/user/Downloads/tyes.onnx --labels=/home/user/Downloads/a.txt --input-blob=input_0 --output-cvg=771 --output-bbox=791

I see no memory warnings; I think the network works. But I get errors from gstreamer about an unsupported image format:

detectnet:  failed to capture video frame
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/cuda/cudaYUV-NV12.cu:154
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/cuda/cudaColorspace.cpp:42
[cuda]      unspecified launch failure (error 719) (hex 0x2CF)
[cuda]      /home/user/Projects/jetson-inference/utils/codec/gstDecoder.cpp:895
[gstreamer] gstDecoder::Capture() -- unsupported image format (rgb8)
[gstreamer]                          supported formats are:
[gstreamer]                              * rgb8
[gstreamer]                              * rgba8
[gstreamer]                              * rgb32f
[gstreamer]                    

Note: I get this error message on every frame, sequentially.

@hokwangchoi (Author)

Here I found some hint on how to export the model with the detection layer: ultralytics/yolov5#708 (comment)

However, default TensorRT doesn't support the ScatterND operation, which is used in the detection layer. So one possible option would be to implement the post-processing from the three outputs (https://github.com/ultralytics/yolov5/blob/master/models/yolo.py#L52-L59) plus NMS.

@murthax This is my current update.
@berkantay Please make sure your output layers --output-cvg=771 --output-bbox=791 really are the confidences and bounding boxes.
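For reference, a minimal sketch of that post-processing in pure Python: decode each cell/anchor prediction with the YOLOv5 head formulas (xy = (2*sigmoid(t) - 0.5 + grid) * stride, wh = (2*sigmoid(t))^2 * anchor), then apply greedy NMS. The helper names and example values below are illustrative assumptions, not jetson-inference code:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, gx, gy, stride, anchor_w, anchor_h):
    """YOLOv5 head decoding for one grid cell and anchor (cf. yolo.py L52-L59)."""
    cx = (2 * sigmoid(tx) - 0.5 + gx) * stride
    cy = (2 * sigmoid(ty) - 0.5 + gy) * stride
    w = (2 * sigmoid(tw)) ** 2 * anchor_w
    h = (2 * sigmoid(th)) ** 2 * anchor_h
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2  # x1, y1, x2, y2

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(boxes, scores, iou_thres=0.45):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thres]
    return keep
```

With the decoded boxes from all three output scales concatenated, nms() keeps only the highest-scoring box of each overlapping cluster, which is the part the ScatterND-based detection layer would otherwise handle inside the model.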


RGring commented Jan 20, 2021

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!


murthax commented Jan 20, 2021

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

I wish I had better news but I was not able to move forward here. I've been using Yolov5 directly on my Jetson Xavier NX. The FPS isn't as good but at least I'm able to move forward with my project.

@berkantay

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

Unfortunately I stopped my work with jetson-inference and implemented yolov5 itself, without the jetson-inference library, on a Jetson TX2.


RGring commented Jan 22, 2021

Ok thanks for your responses!

@hokwangchoi (Author)

Hi @choi0330 @berkantay,
I am also interested in using ultralytics yolov5 models within the jetson-inference framework. Did you get it working? I would appreciate your help!

I went a bit further and tried to implement the detection layers in the jetson-inference library, but it didn't work out. I guess it will be best to wait for a ScatterND plugin in TensorRT before running yolov5 with jetson-inference.
In the end, I also run YOLOv5 directly on the Jetson for now.

@dusty-nv (Owner)

It looks like there are some TensorRT YOLOv5 projects out there:

https://github.com/SeanAvery/yolov5-tensorrt
https://www.google.com/search?q=tensorRT+yolov5


MichaelWU0726 commented Aug 17, 2021


The ScatterND operator has been supported since TensorRT version 8. Could you migrate the implementation of the scatterPlugin to version 7? I have to use version 7.
