
custom data trained model export problem. #99

xinsuinizhuan opened this issue Jul 12, 2022 · 25 comments

@xinsuinizhuan

xinsuinizhuan commented Jul 12, 2022

I trained on my custom data, then used the resulting .pt to export an ONNX model, but the exported ONNX output is:
(screenshot: exported graph of the custom-trained model)

It has three Sigmoid output layers.

But when I export the ONNX model from the pretrained weights, the output is:
(screenshot: exported graph of the pretrained model)

So when I run my custom-trained model through C++ OpenVINO and TensorRT inference, the result is empty!

@xinsuinizhuan
Author

@WongKinYiu any idea about this problem? The model trained on custom data exports to ONNX with three Sigmoid output layers, but the pretrained models do not.

@WongKinYiu
Owner

WongKinYiu commented Jul 13, 2022

Reparameterizing the model before export may help.
https://github.com/WongKinYiu/yolov7#re-parameterization

@xinsuinizhuan
Author

Reparameterizing the model before export may help. https://github.com/WongKinYiu/yolov7#re-parameterization

OK! Let me give it a try! What about YOLOv7-tiny reparameterization?

@xinsuinizhuan
Author

My custom-trained yolov7 model breaks when I run the yolov7 reparameterization script; the error is:
File "E:\Item\Item_done\yolo\yolo5\yolov7\yolov7-main\YOLOv7_reparameterization.py", line 23, in <module>
model.state_dict()['model.105.m.0.weight'].data[i, :, :, :] *= state_dict['model.105.im.0.implicit'].data[:, i, : :].squeeze()
IndexError: index 24 is out of bounds for dimension 0 with size 24

@xinsuinizhuan
Author

Reparameterizing the model before export may help. https://github.com/WongKinYiu/yolov7#re-parameterization

Could this be handled by export.py in the code, i.e. export the ONNX after reparameterization? Because if the ONNX cannot be exported correctly, then DNN, OpenVINO and TensorRT cannot run inference.

@xinsuinizhuan
Author

When I reparameterize yolov7.pt, the pretrained model, it also breaks; the error is:
C:\ProgramData\Anaconda3\envs\yolov5\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2228.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
File "E:\Item\Item_done\yolo\yolo5\yolov7\yolov7-main\YOLOv7_reparameterization.py", line 23, in
model.state_dict()['model.105.m.0.weight'].data[i, :, :, :] *= state_dict['model.105.im.0.implicit'].data[:, i, : :].squeeze()
KeyError: 'model.105.im.0.implicit'

@xinsuinizhuan
Author

xinsuinizhuan commented Jul 13, 2022

@WongKinYiu could you help me solve this problem? I can't export the custom-trained model correctly, so I can't run inference with it in my project.

@WongKinYiu
Owner

255 in the script means (nc+5)*3, where nc is 80.
You have to change 255 to the correct value corresponding to your nc.
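
For example (a minimal sketch; nc = 2 is only an illustrative value and the variable names are not from the script):

nc = 2                   # number of classes in the custom dataset
na = 3                   # anchors per detection layer in YOLOv7
out_ch = (nc + 5) * na   # 21 here; 255 only holds for the 80-class COCO models

Every place the script hard-codes 255, including the loop bound, then has to use this value (21 in this example) instead.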

@sarmientoj24

also getting the same thing

(screenshot of the exported ONNX graph)

@sarmientoj24

I am using a custom dataset of 2 classes

@xinsuinizhuan
Author

@WongKinYiu Thank you. The model export is OK now, but when I run inference with the model, the result is empty; it cannot detect any objects. With the model that has no reparameterization, onnxruntime inference is OK, but OpenVINO, DNN and TensorRT do not work. After reparameterization, none of them work; no objects are detected.

@sarmientoj24

@WongKinYiu I am getting this error:

    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ckpts/yolov7.onnx failed:Node (Mul_664) Op (Mul) [ShapeInferenceError] Incompatible dimensions

@sarmientoj24

@xinsuinizhuan how did you export your model? what is the command?

@sarmientoj24

also, how were you able to have names there?

@xinsuinizhuan
Author

@xinsuinizhuan how did you export your model? what is the command?

Use the u5 branch and export the same way as yolov5. Then onnxruntime inference is OK, but the other methods do not work.
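
(For reference: on the u5 branch the export interface follows yolov5, so the command typically looks like python export.py --weights best.pt --include onnx. The exact flags are an assumption, not confirmed in this thread; check python export.py --help in that branch.)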

@WongKinYiu
Owner

#114 should also work.

@xinsuinizhuan
Author

xinsuinizhuan commented Jul 14, 2022

#114 should also work.

@WongKinYiu The problem now is: I train on my custom data and get a model. I export it to ONNX with the u5 branch and run inference with onnxruntime, which works, but with OpenVINO, DNN and TensorRT no objects are detected. Comparing it with the pretrained model, it adds three Sigmoid layers. Then I reparameterize it with https://github.com/WongKinYiu/yolov7#re-parameterization and export again with the u5 branch; the three Sigmoid layers are gone, but now none of the methods (onnxruntime/DNN/OpenVINO/TensorRT) detect any objects.

@xinsuinizhuan
Author

xinsuinizhuan commented Jul 14, 2022

#114 should also work.
With the pretrained models it is also OK, but with my custom data it still does not work.
It also does not work for me. It exports like the u5 branch does (the u5 branch export just adds the simplify step), and the result still has the three Sigmoid layers.

@xinsuinizhuan
Author

What about reparameterization for yolov7-tiny?

@PhucHau0312

When I reparameterize yolov7.pt, the pretrained model, it also breaks; the error is: [...] KeyError: 'model.105.im.0.implicit'

@xinsuinizhuan how did you fix this bug? Please show me.

@UUID81

UUID81 commented Nov 24, 2022

Hi! I have no problem with the reparameterization of yolov7-tiny using this source code:

from copy import deepcopy
from models.yolo import Model
import torch
from utils.torch_utils import select_device, is_parallel
import yaml

device = select_device('CPU', batch_size=1)

# checkpoint trained with cfg/training/yolov7-tiny.yaml
ckpt = torch.load('C:/Users/morga/Documents/yolov7-train-prepross/yolov7-main/best.pt', map_location=device)

# reparameterized model built from the deploy config
model = Model('cfg/deploy/yolov7-tiny.yaml', ch=3, nc=4).to(device)

with open('cfg/deploy/yolov7-tiny.yaml') as f:
    yml = yaml.load(f, Loader=yaml.SafeLoader)
anchors = len(yml['anchors'][0]) // 2

# copy the intersecting weights
state_dict = ckpt['model'].float().state_dict()
exclude = []
intersect_state_dict = {k: v for k, v in state_dict.items() if k in model.state_dict() and not any(x in k for x in exclude) and v.shape == model.state_dict()[k].shape}
model.load_state_dict(intersect_state_dict, strict=False)
model.names = ckpt['model'].names
model.nc = ckpt['model'].nc

# fold the implicit layers into the conv weights and biases
# (model.77 is the detection head in yolov7-tiny; yolov7 uses model.105)
for i in range((model.nc+5)*anchors):
    model.state_dict()['model.77.m.0.weight'].data[i, :, :, :] *= state_dict['model.77.im.0.implicit'].data[:, i, : :].squeeze()
    model.state_dict()['model.77.m.1.weight'].data[i, :, :, :] *= state_dict['model.77.im.1.implicit'].data[:, i, : :].squeeze()
    model.state_dict()['model.77.m.2.weight'].data[i, :, :, :] *= state_dict['model.77.im.2.implicit'].data[:, i, : :].squeeze()
model.state_dict()['model.77.m.0.bias'].data += state_dict['model.77.m.0.weight'].mul(state_dict['model.77.ia.0.implicit']).sum(1).squeeze()
model.state_dict()['model.77.m.1.bias'].data += state_dict['model.77.m.1.weight'].mul(state_dict['model.77.ia.1.implicit']).sum(1).squeeze()
model.state_dict()['model.77.m.2.bias'].data += state_dict['model.77.m.2.weight'].mul(state_dict['model.77.ia.2.implicit']).sum(1).squeeze()
model.state_dict()['model.77.m.0.bias'].data *= state_dict['model.77.im.0.implicit'].data.squeeze()
model.state_dict()['model.77.m.1.bias'].data *= state_dict['model.77.im.1.implicit'].data.squeeze()
model.state_dict()['model.77.m.2.bias'].data *= state_dict['model.77.im.2.implicit'].data.squeeze()

# model to be saved
ckpt = {'model': deepcopy(model.module if is_parallel(model) else model).half(),
        'optimizer': None,
        'training_results': None,
        'epoch': -1}

torch.save(ckpt, 'cfg/deploy/yolov7tiny_re.pt')

Now I want to test my weights with detect.py, but it crashes; no saved results, nothing, only this:

(PPV7) PS C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main> python detect.py --source C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\figure\mask.png --device cpu
Namespace(weights='C:/Users/morga/Documents/yolov7-train-prepross/yolov7-main/cfg/deploy/yolov7tiny_re.pt', source='C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\figure\mask.png', img_size=640, conf_thres=0.25, iou_thres=0.45, device='cpu', view_img=False, save_txt=False, save_conf=False, nosave=False, classes=None, agnostic_nms=False, augment=False, update=False, project='runs/detect', name='exp', exist_ok=False, no_trace=False)
YOLOR 2022-11-22 torch 1.13.0+cpu CPU

Fusing layers...
Model Summary: 200 layers, 6014737 parameters, 24273 gradients
Convert model to Traced-model...
traced_script_module saved!
model is traced!

Traceback (most recent call last):
File "C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\detect.py", line 196, in
detect()
File "C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\detect.py", line 88, in detect
pred = model(img, augment=opt.augment)[0]
File "C:\Users\morga\Documents\yolov7-train-prepross\PPV7\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\utils\torch_utils.py", line 373, in forward
out = self.detect_layer(out)
File "C:\Users\morga\Documents\yolov7-train-prepross\PPV7\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\models\yolo.py", line 53, in forward
self.grid[i] = self._make_grid(nx, ny).to(x[i].device)
File "C:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\models\yolo.py", line 81, in _make_grid
yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
TypeError: cannot unpack non-iterable NoneType object

Today I'm on my personal computer and it doesn't have CUDA. In detect.py line 31 the code says:

half = device.type != 'cpu' # half precision only supported on CUDA

Is that why I get the TypeError: cannot unpack non-iterable NoneType object in _make_grid (models/yolo.py, line 81)?

After that I modified yolo.py with yolov7-tiny.yaml and my nc (for me nc=4), but nothing changed! :-/

Please help if this 'cannot unpack non-iterable NoneType object' error means anything to you, or if you remember this type of error in a case like this.

Same problem if I use vizualisation.py: after output = model(image) I get 'NoneType' object has no attribute 'shape'

with this before :
Cell In [9], line 2
1 image = cv2.imread('./yolov7-main/inference/images/bus.jpg') # 504x378 image
----> 2 image = letterbox(image, 1280, stride=64, auto=True)[0]
3 image_ = image.copy()
4 image = transforms.ToTensor()(image)

File c:\Users\morga\Documents\yolov7-train-prepross\yolov7-main\utils\datasets.py:986, in letterbox(img, new_shape, color, auto, scaleFill, scaleup, stride)
984 def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
985 # Resize and pad image while meeting stride-multiple constraints
--> 986 shape = img.shape[:2] # current shape [height, width]
987 if isinstance(new_shape, int):
988 new_shape = (new_shape, new_shape)

Why does letterbox crash on zidane.jpg? Can I only visualize if the image size is 504 x 378? What is that format/resolution?

Please give me a good way to understand what this means. Big thanks.

@UUID81

UUID81 commented Nov 24, 2022

I meant vizualisation.ipynb

@pradan7

pradan7 commented Jun 23, 2023

The issue arises because the state_dict contains no entry/key named 'model.105.im.0.implicit', i.e. the 'im' part is missing.
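
One quick way to confirm that (a minimal sketch added for illustration; the checkpoint path is a placeholder) is to list which detection-head keys the checkpoint actually contains before running the reparameterization script:

import torch

ckpt = torch.load('best.pt', map_location='cpu')  # placeholder path to your checkpoint
state_dict = ckpt['model'].float().state_dict()

# print every key belonging to the detection head (layer 105 in yolov7.yaml, 77 in yolov7-tiny.yaml)
head_keys = [k for k in state_dict if k.startswith('model.105.')]
print('\n'.join(head_keys))

A checkpoint trained with a cfg/training/*.yaml config should normally list model.105.ia.*.implicit and model.105.im.*.implicit entries; if only model.105.m.* keys show up, the implicit layers are missing and the reparameterization script will raise exactly this KeyError.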

@UUID81

UUID81 commented Jun 23, 2023 via email

@bbikdash

I also had this problem. I would train a model on custom data, load it, and export it to ONNX. The ONNX model that was not reparameterized would produce bounding boxes, and the one that was reparameterized would not.

The solution is to NOT fuse the model layers for the reparameterization procedure (i.e. don't use attempt_load()). Just do something like this:

import torch
from models.yolo import Model

device = torch.device('cpu')  # or your CUDA device

# load the raw checkpoint; attempt_load() would fuse Conv+BN layers, which we want to avoid here
ckpt = torch.load("best.pt", map_location=device)
model = ckpt['model']

# Perform reparametrization
deploy = Model("cfg/deploy..."...)  # build the deploy-config model for your nc
deploy.load_state_dict(model.state_dict(), strict=False)
...
# Perform rest of export as normal
