Unable to Infer from a trained custom model #10180

Closed
hey24sheep opened this issue Nov 17, 2022 · 7 comments · Fixed by #10190
Labels
question Further information is requested

Comments

@hey24sheep


Question

Hi, I am facing an issue while trying to run inference. What am I doing wrong here? Can anyone please help me?

Trained like

!python -m torch.distributed.run --nproc_per_node 2 yolov5/train.py --data train.yaml --weights yolov5m6.pt --img 1280 --epochs 2 --device 0,1

Inferring like

!python yolov5/detect.py --weights 'last.pt' --imgsz 1280 --source 'test.jpg' --data 'train.yaml'

# I also tried without "--data" arg
# python yolov5/detect.py --weights 'last.pt' --imgsz 1280 --source 'test.jpg'

Error


detect: weights=['last.pt'], source=test.jpg, data=train.yaml, imgsz=[1280, 1280], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5/runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-243-g5e03f5f Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla T4, 15110MiB)

Fusing layers... 
Model summary: 276 layers, 35341272 parameters, 0 gradients, 49.1 GFLOPs
Traceback (most recent call last):
  File "yolov5/detect.py", line 258, in <module>
    main(opt)
  File "yolov5/detect.py", line 253, in main
    run(**vars(opt))
  File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "yolov5/detect.py", line 95, in run
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
  File "/kaggle/working/yolov5/models/common.py", line 501, in __init__
    if names[0] == 'n01440764' and len(names) == 1000:  # ImageNet
KeyError: 0

Other infer code

model = torch.hub.load('./yolov5', 'custom', path='last.pt', source='local') 
model.to('cuda')
img = cv2.imread("test.jpg")
img = cv2.resize(img, (1280, 1280))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# I tried the below code as well, it doesn't work
# im = torch.from_numpy(img).to('cuda')
# im = im.half() if True else im.float()  # uint8 to fp16/32
# im /= 255  # 0 - 255 to 0.0 - 1.0
# if len(im.shape) == 3:
    # im = im[None]  # expand for batch dim

results = model(img)

Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_23/1415234167.py in <module>
     14 # print(t_img.shape)
     15 # Inference
---> 16 results = model(img)
     17 # Results, change the flowing to: results.show()
     18 results.show()  # or .show(), .save(), .crop(), .pandas(), etc

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/kaggle/working/yolov5/models/yolo.py in forward(self, x, augment, profile, visualize)
    207         if augment:
    208             return self._forward_augment(x)  # augmented inference, None
--> 209         return self._forward_once(x, profile, visualize)  # single-scale inference, train
    210 
    211     def _forward_augment(self, x):

/kaggle/working/yolov5/models/yolo.py in _forward_once(self, x, profile, visualize)
    119             if profile:
    120                 self._profile_one_layer(m, x, dt)
--> 121             x = m(x)  # run
    122             y.append(x if m.i in self.save else None)  # save output
    123             if visualize:

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/kaggle/working/yolov5/models/common.py in forward(self, x)
     55 
     56     def forward(self, x):
---> 57         return self.act(self.bn(self.conv(x)))
     58 
     59     def forward_fuse(self, x):

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    445 
    446     def forward(self, input: Tensor) -> Tensor:
--> 447         return self._conv_forward(input, self.weight, self.bias)
    448 
    449 class Conv3d(_ConvNd):

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    442                             _pair(0), self.dilation, self.groups)
    443         return F.conv2d(input, weight, bias, self.stride,
--> 444                         self.padding, self.dilation, self.groups)
    445 
    446     def forward(self, input: Tensor) -> Tensor:

TypeError: conv2d() received an invalid combination of arguments - got (numpy.ndarray, Parameter, NoneType, tuple, tuple, tuple, int), but expected one of:
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!numpy.ndarray!, !Parameter!, !NoneType!, !tuple!, !tuple!, !tuple!, int)
 * (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
      didn't match because some of the arguments have invalid types: (!numpy.ndarray!, !Parameter!, !NoneType!, !tuple!, !tuple!, !tuple!, int)

Additional

No response

@hey24sheep hey24sheep added the question Further information is requested label Nov 17, 2022
@github-actions
Contributor

github-actions bot commented Nov 17, 2022

👋 Hello @hey24sheep, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

@hey24sheep
Author

Infer like

model = torch.hub.load('./yolov5', 'custom', path='last.pt', source='local') 
model.to('cuda')
img = cv2.imread("test.jpg")
img = cv2.resize(img, (1280, 1280))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
im = torch.from_numpy(img).to('cuda')
results = model(im)

Error

RuntimeError: Given groups=1, weight of size [48, 3, 6, 6], expected input[1, 1280, 1280, 3] to have 3 channels, but got 1280 channels instead

@hey24sheep
Author

If I do this

im1 = torch.reshape(im, (1, 3, 1280, 1280))

results = model(im1.float())

It gives me a tuple of 2 values (tensors), so results.save() fails:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_23/3195347922.py in <module>
     19 # Results, change the flowing to: results.show()
     20 # results.show()  # or .show(), .save(), .crop(), .pandas(), etc
---> 21 results.save()
     22 print(results.xyxy[0])
     23 # print(results)

AttributeError: 'tuple' object has no attribute 'save'
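Note: torch.reshape on an HWC image scrambles the pixel layout rather than moving the channel axis, and a raw (non-AutoShape) model call returns a plain tuple of tensors, which is why .save() is missing. A minimal preprocessing sketch (not from the thread; paths and sizes are only illustrative) would permute the axes instead:

import cv2
import torch

model = torch.hub.load('./yolov5', 'custom', path='last.pt', source='local')  # local repo, as above
model.to('cuda')

img = cv2.imread("test.jpg")                 # HWC, BGR, uint8
img = cv2.resize(img, (1280, 1280))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # HWC, RGB

im = torch.from_numpy(img).to('cuda')
im = im.permute(2, 0, 1).contiguous()        # HWC -> CHW (reorders axes; reshape does not)
im = im.float() / 255                        # uint8 0-255 -> float 0.0-1.0
im = im[None]                                # add batch dim -> 1x3x1280x1280

pred = model(im)                             # tensor input gives raw model output (a tuple), not a Detections object

The .save()/.show()/.pandas() helpers exist only on the Detections object that the AutoShape wrapper returns when it is given a numpy or PIL image directly, not on raw tensor outputs.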

@hey24sheep
Author

Found the problem,

model.names

Output is

{'0': '1011',
 '1': '1012',
 '2': '1013',
 '3': '1100',
 '4': '2050',
 '5': '1040',
 '6': '1030',
 '7': '1120',
 '8': '1110',
 '9': '1135',
 '10': '4000',
 '11': '5010',
 '12': '1003',
 '13': '1002',
 '14': '1070',
 '15': '2010',
 '16': '2000'}

So, the KeyError happens because detect.py looks up names[0], but my keys are strings, i.e. names['0'].

 File "/kaggle/working/yolov5/models/common.py", line 501, in __init__
    if names[0] == 'n01440764' and len(names) == 1000: 
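
A quick reproduction of that lookup, using the names dict shown above (truncated):

names = {'0': '1011', '1': '1012'}   # keys parsed as strings from the quoted YAML
print(names['0'])                    # '1011' -- works
print(names[0])                      # KeyError: 0 -- what detect.py hits, since it expects integer keys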

@glenn-jocher
Member

glenn-jocher commented Nov 17, 2022

@hey24sheep maybe your data yaml is formatted incorrectly. To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:

1.1 Create dataset.yaml

COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths) and 2) a class names dictionary:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  ...
  77: teddy bear
  78: hair drier
  79: toothbrush

1.2 Create Labels

After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:

  • One row per object
  • Each row is class x_center y_center width height format.
  • Box coordinates must be in normalized xywh format (from 0 - 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
  • Class numbers are zero-indexed (start from 0).

The label file corresponding to the tutorial's example image (not reproduced here) contains 2 persons (class 0) and a tie (class 27):
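
For illustration only (these coordinates are made up, not the tutorial's exact values), such a label file would hold three rows in class x_center y_center width height order:

0 0.48 0.63 0.69 0.71
0 0.74 0.52 0.31 0.93
27 0.36 0.79 0.08 0.40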

1.3 Organize Directories

Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:

../datasets/coco128/images/im0.jpg  # image
../datasets/coco128/labels/im0.txt  # label

Good luck 🍀 and let us know if you have any other questions!

@hey24sheep
Author

Yes, the issue was with the data.yaml "names": the keys had single quotes around them, so they were parsed as strings instead of integers. Thank you.
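
For anyone hitting the same KeyError, the difference in the data.yaml names mapping looks roughly like this (class names taken from the output above, only for illustration):

# Quoted keys are parsed as strings, so names[0] raises KeyError: 0
names:
  '0': '1011'
  '1': '1012'

# Unquoted keys are parsed as integers, so names[0] works
names:
  0: '1011'
  1: '1012'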

@glenn-jocher
Member

@hey24sheep you're welcome! I'm glad to hear that the issue has been resolved. If you have any more questions or need further assistance, feel free to ask. Happy coding!
