
YOLOv5 in LibTorch produces different results #312

Closed
zherlock030 opened this issue Jul 6, 2020 · 23 comments

@zherlock030

🐛 Bug


I followed https://gist.github.com/jakepoz/eb36163814a8f1b6ceb31e8addbba270 to derive the TorchScript model.

In both my C++ code and my Python code I tested the same picture. I verified that the input tensors were identical after pre-processing, but the model outputs differ.

To Reproduce (REQUIRED)

The picture shape is (channels = 3, height = 360, width = 640).

Python input:

import cv2
import numpy as np
import torch

img_path = 'test.png'
img = cv2.imread(img_path)                # BGR, HWC
img = letterbox(img, new_shape=(640, 640))[0]
img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
img = np.ascontiguousarray(img)
img = torch.from_numpy(img).to(device).float()
img /= 255.0                              # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
    img = img.unsqueeze(0)                # shape (1, 3, 384, 640)
pred = model(img, augment=False)
print(pred[0].shape)

Python Output:

torch.Size([1, 15120, 85])

C++ input

string img_path = "test.png";
Mat img = imread(img_path);
img = letterbox(img);                        // resize + pad
cvtColor(img, img, COLOR_BGR2RGB);           // BGR -> RGB
img.convertTo(img, CV_32FC3, 1.0f / 255.0f); // scale to [0, 1]
// note: from_blob does not copy; img must stay alive while tensor_img is used
auto tensor_img = torch::from_blob(img.data, {img.rows, img.cols, img.channels()});
tensor_img = tensor_img.permute({2, 0, 1});  // HWC -> CHW
tensor_img = tensor_img.unsqueeze(0);
cout << "tensor size is " << tensor_img.sizes() << endl; // (1, 3, 384, 640)

std::vector<torch::jit::IValue> inputs;
inputs.push_back(tensor_img);
torch::jit::IValue output = model.forward(inputs);

auto op = output.toList().get(0).toTensor();

cout << "op sizes: " << op.sizes() << endl;

C++ output

output tensor shape  [1, 3, 48, 80, 85],
and 3*48*80 = 11520 != 15120

Expected behavior


I would expect the model output in C++ to be the same as in Python.

Environment

  • OS: Ubuntu
  • Device: CPU
@zherlock030 zherlock030 added the bug Something isn't working label Jul 6, 2020
@github-actions
Contributor

github-actions bot commented Jul 6, 2020

Hello @zherlock030, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook Open In Colab, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@zherlock030
Author

@jakepoz

@glenn-jocher glenn-jocher removed the bug Something isn't working label Jul 6, 2020
@glenn-jocher
Member

glenn-jocher commented Jul 6, 2020

We have updated export.py to support torchscript export now, among others. The tutorial is here: https://docs.ultralytics.com/yolov5/tutorials/model_export

Note that these are simple examples to get you started. Actual export and deployment (to an edge device, for example) is a very complicated journey. We have not open sourced the entire process, but we do offer paid support in this area. If you have a business need let us know and we'd be happy to help you!

@jakepoz
Contributor

jakepoz commented Jul 7, 2020

@zherlock030, this is because the final Detect layer in yolov5 undoes the action of yolo's "anchor" system during regular operation, but this step is not exported by the export script:

pjreddie/darknet#568

Unfortunately, I have not yet figured out the details here. It seems as if some of the variables like self.anchors and self.anchor_grid are stored as registered parameters, but self.strides is not, and I have difficulty exporting the model with the anchor code turned on.
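[Editor's note: what the eval-mode Detect layer does to one raw feature map can be sketched in NumPy. This is a hedged illustration of the anchor decode described above, not yolov5's actual code; the `decode_detect_output` name, the toy shapes, and the single anchor are made up for the example.]

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_detect_output(x, anchors, stride):
    # x: raw feature map of shape (num_anchors, ny, nx, no)
    # anchors: (num_anchors, 2) anchor sizes in pixels for this level
    na, ny, nx, no = x.shape
    gy, gx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((gx, gy), axis=-1)[None]              # (1, ny, nx, 2) cell offsets
    y = sigmoid(x)
    y[..., 0:2] = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride            # box centers in pixels
    y[..., 2:4] = (y[..., 2:4] * 2.0) ** 2 * anchors[:, None, None, :] # box sizes in pixels
    return y.reshape(na * ny * nx, no)                    # flatten like eval-mode output

# toy example: a single 2x2 grid with one anchor and stride 8
raw = np.zeros((1, 2, 2, 6), dtype=np.float32)
decoded = decode_detect_output(raw, np.array([[10.0, 13.0]], dtype=np.float32), stride=8)
print(decoded.shape)  # (4, 6)
```

With zero logits, sigmoid gives 0.5, so each decoded center lands in the middle of its cell ((grid + 0.5) * stride) and each width/height collapses to the raw anchor size.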

@winself

winself commented Jul 23, 2020

@zherlock030 @jakepoz did you solve the problem? I am hitting the same issue. Looking forward to your reply, thank you.

@zherlock030
Author

zherlock030 commented Jul 24, 2020

> @zherlock030 @jakepoz did you solve the problem? I am hitting the same issue. Looking forward to your reply, thank you.

@winself
I think I have made it work. I'm not sure if I should open source it since @glenn-jocher has his concerns.
I could share my code with you.

@zherlock030
Author

zherlock030 commented Jul 24, 2020

> @zherlock030, this is because the final Detect layer in yolov5 undoes the action of yolo's "anchor" system during regular operation, but this step is not exported by the export script:
>
> pjreddie/darknet#568
>
> Unfortunately, I have not yet figured out the details here. It seems as if some of the variables like self.anchors and self.anchor_grid are stored as registered parameters, but self.strides is not, and I have difficulty exporting the model with the anchor code turned on.

@jakepoz
Thanks for your reply. I just treat self.strides as constants, and for now it produces the same reasonable results as in Python.

@zherlock030
Author

> We have updated export.py to support torchscript export now, among others. The tutorial is here: #251
>
> Note that these are simple examples to get you started. Actual export and deployment (to an edge device, for example) is a very complicated journey. We have not open sourced the entire process, but we do offer paid support in this area. If you have a business need let us know and we'd be happy to help you!

Thanks for your reply. I think I have made it work; yolov5s is so fast.

@glenn-jocher
Member

@zherlock030 hi, no worries about open sourcing your work! The only requirement is that you retain the current GPL3 license on modifications.

We eventually want to open source 100% of everything, including the export pipelines and the iDetection iOS app source code. We are trying to adjust our business model to make this happen either later this year or next year.

@winself

winself commented Jul 27, 2020

@zherlock030 Thanks for your reply! `self.training |= self.export` causes this result: when export is True, training is True, so the TorchScript model produces the training-mode output, and we need to write some code to post-process the result. Is this right?

@zherlock030
Author

> @zherlock030 Thanks for your reply! `self.training |= self.export` causes this result: when export is True, training is True, so the TorchScript model produces the training-mode output, and we need to write some code to post-process the result. Is this right?

Yes, we need to write code for image preprocessing, the Detect layer, and NMS.
You can see my implementation at https://github.com/zherlock030/YOLOv5_Torchscript.
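[Editor's note: as a rough idea of the NMS step mentioned above, here is a minimal greedy NMS sketch in NumPy. It is an illustration with made-up helper names, not the code from the linked repo.]

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU of one box (x1, y1, x2, y2) against an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS: repeatedly keep the highest-scoring box and drop
    remaining boxes that overlap it by more than iou_thres."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou_one_to_many(boxes[best], boxes[rest]) <= iou_thres]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
print(nms(boxes, scores))  # [0, 2]
```

In the example, the second box overlaps the first with IoU ≈ 0.68 > 0.45 and is suppressed; the far-away third box survives.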

@phamdat09

Hello,
I am also interested in running YOLOv5 in C++. @zherlock030, when you run yolov5, how many GB of GPU memory do you use? Is it lower than when running in Python?
Thanks

@easycome2009

@zherlock030, hi! Similar to you, I also wrote the nms.cpp code. I used the official export.py to export the torchscript file; the output op is a tensor of [1, gridx, gridy, 9], however the 9-vector is totally wrong. Is the exported torchscript file not right? Do I need any modification? I also noticed this warning during torch.jit.trace: `TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future.`

@yasenh

yasenh commented Jul 30, 2020

@easycome2009 what I did is set `model.model[-1].export = False` in export.py line#28, and I get similar results from Python and C++.

@zherlock030
Author

@phamdat09 hi, actually I'm using a CPU.

@zherlock030
Author

@easycome2009 yes, when you run export.py you need to modify the Detect layer so that it just outputs the input list `x`, and then implement the Detect layer in your C++ code.

@zherlock030
Author

@yasenh yes, I tried that too, but that way we can't feed the network pictures of different shapes.

@yasenh

yasenh commented Aug 5, 2020

@zherlock030, here is my implementation just FYI: https://github.com/yasenh/libtorch-yolov5
The image will be padded to a fixed size, e.g. (640, 640).

@zherlock030
Author

@yasenh yeah, I know what you mean, but actually with the letterbox function an image of any shape can be fed to yolo.

@yasenh

yasenh commented Aug 5, 2020

> @yasenh yeah, I know what you mean, but actually with the letterbox function an image of any shape can be fed to yolo.

I think you can still do that, but the benefit of padding images to the same size is that we can process them as a batch. Otherwise you need to process images of different sizes one by one.
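[Editor's note: for reference, the shape arithmetic behind letterbox-style padding can be sketched as follows. `letterbox_shape` is a made-up helper that only computes sizes; yolov5's real letterbox also resizes pixels and may split odd padding slightly differently.]

```python
def letterbox_shape(h, w, new_size=640, stride=32):
    """Scale (h, w) so the longer side fits new_size, then pad each axis
    up to the next multiple of stride, splitting the pad evenly.
    Returns (padded_h, padded_w, (top, bottom, left, right))."""
    r = min(new_size / h, new_size / w)          # keep aspect ratio
    uh, uw = int(round(h * r)), int(round(w * r))
    dh, dw = (-uh) % stride, (-uw) % stride      # padding needed per axis
    top, left = dh // 2, dw // 2
    return uh + dh, uw + dw, (top, dh - top, left, dw - left)

# the 360x640 image from this thread pads to the 384x640 input seen above
print(letterbox_shape(360, 640))  # (384, 640, (12, 12, 0, 0))
```

Because padding only goes up to the next stride multiple rather than a full 640x640 square, the network sees fewer wasted pixels per image, at the cost of shapes varying between images (which is exactly why fixed-size padding is preferred for batching).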

@github-actions
Contributor

github-actions bot commented Sep 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@FantasyJXF

> C++ output
>
> output tensor shape  [1, 3, 48, 80, 85],
> and 3*48*80 = 11520 != 15120

That's because the C++ (export-mode) output is a list [(1, 3, height/8, width/8, 85), (1, 3, height/16, width/16, 85), (1, 3, height/32, width/32, 85)], while the Python (eval-mode) output is a tuple ([1, num_anchors, 85], [(1, 3, height/8, width/8, 85), (1, 3, height/16, width/16, 85), (1, 3, height/32, width/32, 85)]).

In your case: 3*(48*80 + 24*40 + 12*20) == 15120
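[Editor's note: to make the arithmetic concrete, here is a small NumPy sketch. The shapes are assumed from this thread (a 384x640 input, 3 anchors per level, 85 channels); it shows that flattening and concatenating the three per-stride maps reproduces the eval-mode box count.]

```python
import numpy as np

# the three raw Detect feature maps for a 384x640 input at strides 8, 16, 32
maps = [np.zeros((1, 3, 384 // s, 640 // s, 85), dtype=np.float32) for s in (8, 16, 32)]
print([m.shape[1] * m.shape[2] * m.shape[3] for m in maps])  # [11520, 2880, 720]

# flatten each map to (1, num_boxes, 85) and concatenate along the box axis
flat = np.concatenate([m.reshape(1, -1, 85) for m in maps], axis=1)
print(flat.shape)  # (1, 15120, 85)
```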

@MHGL

MHGL commented Jul 23, 2021

Our team rewrote the yolov5l model; it can now be converted to TorchScript, ONNX, CoreML, NCNN, TNN, MNN, and TensorRT. Please refer to yolov5.

9 participants