Yolov5-6.0 Specific Bug: The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 4, 4, 2] #5234

Closed
SpaceView opened this issue Oct 18, 2021 · 29 comments
Labels
bug Something isn't working Stale

Comments

@SpaceView

SpaceView commented Oct 18, 2021

This is a bug specific to Yolov5-6.0; Yolov5-5.0 doesn't have this problem.
How to reproduce the bug:

Platform: Windows 10
Python: 3.9.7
Torch: 1.9.1

(step.1) in detect.py, change the following items,
def parse_opt():
    parser = argparse.ArgumentParser()
    #parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5s.pt', help='model path(s)')
    parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'weights/yolov5n.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=ROOT / 'data/images/bus.jpg', help='file/dir/URL/glob, 0 for webcam')

(step.2) Open the project with VS Code; the root directory is "d:/yolov5-master".
By the way, I also tested with yolov5-6.0 (root directory "d:/yolov5-6.0"), and it gives the same error.

(step.3) Run the detect.py script in vscode

The error output is shown below:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: forward)
The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3.  Target sizes: [1, 3, 1, 1, 2].  Tensor sizes: [3, 4, 4, 2]
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 61, in forward (Current frame)
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 149, in _forward_once
    x = m(x)  # run
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 126, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\thop\profile.py", line 188, in profile
    model(*inputs)
  File "D:\vsAI\yolov5-6.0\utils\torch_utils.py", line 236, in model_info
    flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPs
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 235, in info
    model_info(self, verbose, img_size)
  File "D:\vsAI\yolov5-6.0\models\yolo.py", line 225, in fuse
    self.info()
  File "D:\vsAI\yolov5-6.0\models\experimental.py", line 96, in attempt_load
    model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval())  # FP32 model
  File "D:\vsAI\yolov5-6.0\detect.py", line 82, in run
    model = torch.jit.load(w) if 'torchscript' in w else attempt_load(weights, map_location=device)
  File "D:\Anaconda3\envs\torch\Lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "D:\vsAI\yolov5-6.0\detect.py", line 302, in main
    run(**vars(opt))
  File "D:\vsAI\yolov5-6.0\detect.py", line 307, in <module>
    main(opt)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Anaconda3\envs\torch\Lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,

The following line seems to be the problem:

self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

I use the following equivalent code to debug it

tmp_grid, tmp_anchor_grid =  self._make_grid(nx, ny, i)
self.grid[i] = tmp_grid
self.anchor_grid[i] = tmp_anchor_grid

and found that when i==0:
self.anchor_grid[0].shape -- >torch.Size([1, 3, 1, 1, 2])
tmp_anchor_grid.shape -- > torch.Size([1, 3, 4, 4, 2])
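
The two shapes above are enough to explain the message. Indexed assignment into a tensor copies the new value into the existing slot, so the value must be expandable (broadcastable) to the slot's shape. The sketch below re-implements that shape rule in plain Python, without PyTorch. `first_expand_conflict` is a hypothetical helper written for this illustration, not a PyTorch function, and it assumes the rule is checked from the last dimension backwards, which is consistent with the error naming dimension 3.

```python
def first_expand_conflict(source_shape, target_shape):
    """Return the target dimension index (scanning from the last dimension
    backwards) at which source_shape cannot be expanded to target_shape,
    or None if the expansion is valid."""
    if len(source_shape) > len(target_shape):
        raise ValueError("source has more dimensions than target")
    offset = len(target_shape) - len(source_shape)  # shapes are right-aligned
    for dim in range(len(target_shape) - 1, offset - 1, -1):
        size = source_shape[dim - offset]
        # A source dimension is compatible if it is 1 or equals the target size.
        if size != 1 and size != target_shape[dim]:
            return dim
    return None


slot_shape = [1, 3, 1, 1, 2]  # self.anchor_grid[0].shape from the report
new_shape = [3, 4, 4, 2]      # tmp_anchor_grid as printed in the error message

conflict = first_expand_conflict(new_shape, slot_shape)
print(conflict)  # 3
```

The conflict is at dimension 3: the slot has size 1 there, while the new grid has size 4.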

The problem seems to come from the thop.profile call:

flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPS

I currently have no idea why the shapes differ. Where does self.anchor_grid[0] come from?
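
A possible reading of the 4x4 shape: the traceback shows that the failing forward call comes from model_info, which profiles the model on a small dummy image rather than on bus.jpg. The arithmetic below is a sketch under two assumptions: the dummy image is a square whose side equals the largest stride (as in the model_info code quoted later in this thread), and the three detection layers use strides 8, 16 and 32, which is not stated in this report. `grid_size` is a hypothetical helper.

```python
def grid_size(height, width, stride):
    """Number of grid cells (ny, nx) a detection layer sees for a given input size."""
    return height // stride, width // stride


STRIDES = (8, 16, 32)         # assumed strides of the three detection layers
PROFILE_SIDE = max(STRIDES)   # model_info profiles a stride x stride dummy image

profile_grids = [grid_size(PROFILE_SIDE, PROFILE_SIDE, s) for s in STRIDES]
print(profile_grids)  # [(4, 4), (2, 2), (1, 1)]
```

Under these assumptions the first layer (i == 0) gets a 4x4 grid, which matches the tmp_anchor_grid.shape of [1, 3, 4, 4, 2] reported above.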

When I run the script in a Windows PowerShell console, the bug does not appear, as shown below:

$ python detect.py --source ./data/images/bus.jpg
detect: weights=yolov5s.pt, source=./data/images/bus.jpg, imgsz=[640, 640], conf_thres=0.2
0, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False
alse, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_o
e_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5  2021-10-15 torch 1.9.1 CUDA:0 (GeForce GTX 1080 Ti, 11264.0MB)

Fusing layers...
Model Summary: 213 layers, 7225885 parameters, 0 gradients
image 1/1 D:\vsAI\yolov5-master\data\images\bus.jpg: 640x480 4 persons, 1 bus, Done. (0.00
Speed: 1.0ms pre-process, 8.0ms inference, 5.0ms NMS per image at shape (1, 3, 640, 640)
@SpaceView SpaceView added the bug Something isn't working label Oct 18, 2021
@github-actions
Contributor

github-actions bot commented Oct 18, 2021

👋 Hello @SpaceView, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

@SpaceView SpaceView changed the title Bug: The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 4, 4, 2] Yolov5-6.0 Specific Bug: The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 4, 4, 2] Oct 18, 2021
@glenn-jocher
Member

glenn-jocher commented Oct 19, 2021

@SpaceView thanks for the bug report. This might just be due to out of date code or models. I tested this locally in PyCharm MacOS with python 3.9 and everything seems fine:

Screen Shot 2021-10-19 at 1 48 29 PM

The CI tests regularly run YOLOv5n with all main functions (train, val, detect, export) on Windows also and they are green currently:
https://github.com/ultralytics/yolov5/runs/3937706191?check_suite_focus=true

@glenn-jocher
Member

glenn-jocher commented Oct 19, 2021

@fcakyon @SpaceView I'm not able to reproduce any error here. The following two examples execute correctly in Colab.

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5n.pt
!python detect.py --weights runs/train/exp/weights/best.pt

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights '' --cfg yolov5n.yaml
!python detect.py --weights runs/train/exp2/weights/best.pt

Response from detect.py calls is:

detect: weights=['runs/train/exp/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, Done. (0.015s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 2 persons, 1 tie, Done. (0.016s)
Speed: 0.4ms pre-process, 15.3ms inference, 1.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp


detect: weights=['runs/train/exp2/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)

Fusing layers... 
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 Done. (0.016s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 Done. (0.017s)
Speed: 0.4ms pre-process, 16.4ms inference, 0.4ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2

We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

  • Minimal – Use as little code as possible that still produces the same problem
  • Complete – Provide all parts someone else needs to reproduce your problem in the question itself
  • Reproducible – Test the code you're about to provide to make sure it reproduces the problem

In addition to the above requirements, for Ultralytics to provide assistance your code should be:

  • Current – Verify that your code is up-to-date with current GitHub master, and if necessary git pull or git clone a new copy to ensure your problem has not already been resolved by previous commits.
  • Unmodified – Your problem must be reproducible without any modifications to the codebase in this repository. Ultralytics does not provide support for custom code ⚠️.

If you believe your problem meets all of the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template and providing a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

@glenn-jocher
Member

I also trained a new model starting from a custom-trained model (exp2/weights/best.pt) and ran detection again with the new exp3/weights/best.pt; everything worked correctly:

!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights runs/train/exp2/weights/best.pt
!python detect.py --weights runs/train/exp3/weights/best.pt

@jebastin-nadar
Contributor

Hi @SpaceView @fcakyon, yes the bug originates from my PR. I have tried to reproduce the error with pre-trained and custom-trained yolov5n from scratch (similar code as @glenn-jocher), but detect.py works correctly with both models.


self.anchor_grid is supposed to be a list of Tensors, but from the error message, it looks like self.anchor_grid is a Tensor (it was a Tensor before my PR was merged) and assigning a Tensor of different shape is raising this error.
You can check this by adding print(type(self.anchor_grid)) in forward() of Detect module.

This conversion of Tensor to list of Tensors is done in attempt_load()

if not isinstance(m.anchor_grid, list):  # new Detect Layer compatibility
    delattr(m, 'anchor_grid')
    setattr(m, 'anchor_grid', [torch.zeros(1)] * m.nl)

and I see that this function is being called during runtime

File "D:\vsAI\yolov5-6.0\models\experimental.py", line 96, in attempt_load

Compatibility with models trained before my PR was checked before merging it, so it's quite strange to see this bug. As suggested by Glenn, some more reproducer code/models are needed.
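
The list-versus-Tensor distinction described above can be sketched without PyTorch. The snippet uses a plain object as a stand-in for the Detect module and a hypothetical `FakeTensor` class in place of `torch.Tensor`. `patch_anchor_grid` is likewise a hypothetical name that mirrors the quoted compatibility lines, not a function in the repository. Note also that the traceback in the original report reaches forward() from fuse() on line 96 of attempt_load, so one possible explanation (an assumption, not confirmed in this thread) is that profiling runs before the conversion is applied.

```python
from types import SimpleNamespace


class FakeTensor:
    """Stand-in for torch.Tensor: item assignment copies into a fixed-shape slot.
    Simplified: a real tensor would also accept any broadcastable value."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def __setitem__(self, index, value):
        slot_shape = self.shape[1:]
        if value.shape != slot_shape:
            raise RuntimeError(f"cannot copy shape {value.shape} into slot {slot_shape}")


def patch_anchor_grid(module):
    """Replace a tensor-valued anchor_grid with a list of nl placeholders."""
    if not isinstance(module.anchor_grid, list):
        delattr(module, 'anchor_grid')
        setattr(module, 'anchor_grid', [FakeTensor((1,))] * module.nl)
    return module


# Old-style module: anchor_grid is one tensor of shape (nl, 1, na, 1, 1, 2).
old_detect = SimpleNamespace(nl=3, anchor_grid=FakeTensor((3, 1, 3, 1, 1, 2)))
new_grid = FakeTensor((1, 3, 4, 4, 2))

try:
    old_detect.anchor_grid[0] = new_grid  # copy into a (1, 3, 1, 1, 2) slot
    failed_before_patch = False
except RuntimeError:
    failed_before_patch = True

patch_anchor_grid(old_detect)
old_detect.anchor_grid[0] = new_grid      # list assignment just rebinds the slot
print(failed_before_patch, type(old_detect.anchor_grid).__name__)
```

Once anchor_grid is a list, assigning a grid of any shape succeeds, because no data is copied into an existing tensor.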

@fcakyon
Member

fcakyon commented Oct 19, 2021

@glenn-jocher @SamFC10 the error is raised when a model trained on 5.0 source is used with detect.py from 6.0 source. Compatibility addition seems to be not working for some reason.

@jebastin-nadar
Contributor

@fcakyon Please add a link to your trained model if possible. Some edge case is being missed.

@fcakyon
Member

fcakyon commented Oct 19, 2021

I cannot add it for privacy reasons, but I will try to train a dummy model for reproducibility.

@SpaceView
Author

SpaceView commented Oct 19, 2021

@glenn-jocher @fcakyon @SamFC10
Many thanks for your attention. I use VS Code on Windows 10. If you don't use it, the script may get past thop.profile without any warning, so you need to add some extra debug output to reproduce this bug, as below:

def model_info(model, verbose=False, img_size=640):
    # Model information. img_size may be int or list, i.e. img_size=640 or img_size=[640, 320]
    n_p = sum(x.numel() for x in model.parameters())  # number parameters
    n_g = sum(x.numel() for x in model.parameters() if x.requires_grad)  # number gradients
    if verbose:
        print('%5s %40s %9s %12s %20s %10s %10s' % ('layer', 'name', 'gradient', 'parameters', 'shape', 'mu', 'sigma'))
        for i, (name, p) in enumerate(model.named_parameters()):
            name = name.replace('module_list.', '')
            print('%5g %40s %9s %12g %20s %10.3g %10.3g' %
                  (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std()))

    try:  # FLOPs
        from thop import profile
        stride = max(int(model.stride.max()), 32) if hasattr(model, 'stride') else 32
        img = torch.zeros((1, model.yaml.get('ch', 3), stride, stride), device=next(model.parameters()).device)  # input
        print('Now it is time to show the bug, -------------------> for debug purpose \n')  # 
        flops = profile(deepcopy(model), inputs=(img,), verbose=False)[0] / 1E9 * 2  # stride GFLOPs
        print('Can we print this out correctly?--- if NOT, here it is a problem, -------------------> for debug purpose\n')   
        img_size = img_size if isinstance(img_size, list) else [img_size, img_size]  # expand if int/float
        fs = ', %.1f GFLOPs' % (flops * img_size[0] / stride * img_size[1] / stride)  # 640x640 GFLOPs
    except (ImportError, Exception):
        fs = ''

    LOGGER.info(f"Model Summary: {len(list(model.modules()))} layers, {n_p} parameters, {n_g} gradients{fs}")

As you can see, I added two print calls to model_info for debugging. If thop.profile works correctly, both lines should be printed.

My output log is given below. Only the first debug line is shown and the second is not, which means thop.profile was interrupted by an internal error, so the remaining lines in the try block were never executed.

(torch) PS D:\vsAI\yolov5-6.0>  d:; cd 'd:\vsAI\yolov5-6.0'; & 'D:\Anaconda3\envs\torch\python.exe' 'c:\Users\Administrator\.vscode\extensions\ms-python.python-2021.10.1336267007\pythonFiles\lib\python\debugpy\launcher' '58690' '--' 'd:\vsAI\yolov5-6.0\detect.py' 
detect: weights=weights\yolov5n.pt, source=data\images\bus.jpg, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs\detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False    
YOLOv5  2021-10-20 torch 1.9.1 CUDA:0 (GeForce GTX 1080 Ti, 11264.0MB)

Fusing layers... 
Now it is time to show the bug, -------------------> for debug purpose 

Model Summary: 213 layers, 1867405 parameters, 0 gradients
attemp_load_done
image 1/1 D:\vsAI\yolov5-6.0\data\images\bus.jpg: debug
640x480 4 persons, 1 bus, 1 skateboard, Done. (0.026s)
Speed: 2.0ms pre-process, 26.0ms inference, 9.0ms NMS per image at shape (1, 3, 640, 640)
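
The log above is consistent with the try/except in model_info swallowing the error: the except clause catches any Exception, sets fs to an empty string, and the summary line is printed without GFLOPs. A debugger configured to pause on raised exceptions would still stop inside forward, which may be why the error is only visible in VS Code (an assumption about the debugger settings, not confirmed in this thread). A minimal sketch of the pattern follows; the function names are hypothetical and the layer count is copied from the log.

```python
def model_summary(profile_fn):
    """Mimic the try/except structure of model_info: any profiling error is swallowed."""
    try:
        flops = profile_fn()
        fs = f", {flops:.1f} GFLOPs"
    except Exception:
        fs = ""  # the error disappears here; nothing is logged
    return f"Model Summary: 213 layers{fs}"


def failing_profile():
    raise RuntimeError("shape mismatch during profiling")


def working_profile():
    return 4.5


print(model_summary(failing_profile))  # Model Summary: 213 layers
print(model_summary(working_profile))  # Model Summary: 213 layers, 4.5 GFLOPs
```

A missing GFLOPs figure in the summary line is therefore a hint that profiling failed silently.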

It is easy to check this yourself, as I did.

I will look further into this problem over the next couple of days if I have time, from training through evaluation.
I suppose it is caused by a mismatch in the anchor_grid setup somewhere; it seems thop.profile can accept tensor expansion, but not direct tensor replacement. I have no idea why this happens. It seems the change in #4833 caused this problem.

By the way, I used the yolov5-6.0 and 5.0 models from your release archive; they give the same results.

@SpaceView
Author

I may have found the reason: the error has something to do with PyTorch's tensor expansion mechanism (dimension matching). @fcakyon is right:

@glenn-jocher @SamFC10 the error is raised when a model trained on 5.0 source is used with detect.py from 6.0 source. Compatibility addition seems to be not working for some reason.

I used the latest code and ran a short training; the error disappeared when using my own trained weights. If I use a downloaded model (e.g. yolov5n.pt), the error pops up.

@jebastin-nadar
Contributor

jebastin-nadar commented Oct 20, 2021

the error is raised when a model trained on 5.0 source is used

@SpaceView As I've mentioned above, please add links to your trained model if possible, so that the error can be reproduced from my side and debugged.

@RaZzzyz

RaZzzyz commented Oct 20, 2021

I hit this problem when trying the simple example at https://docs.ultralytics.com/tutorials/pytorch-hub/.
I use the 6.0 yolov5s.pt

@jebastin-nadar
Contributor

@RaZzzyz Cannot reproduce the bug using the simple example mentioned in the link. I'm using Google Colab with the latest branch and model.

yolov5-reproducer

@SpaceView
Author

the error is raised when a model trained on 5.0 source is used

@SpaceView As I've mentioned above, please add links to your trained model if possible, so that the error can be reproduced from my side and debugged.

@SamFC10
Please read my answer carefully; I have supplied all the information you need.
The model is from ultralytics, e.g.

https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.pt
https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5n.pt

To reproduce the issue, please read my earlier comment with the two debug prints. You will not see both lines printed if you use an older trained model, even though no exception is raised.

I suppose this issue can be closed. If you train the model using the latest code, there will be no problem.

@yamand16

yamand16 commented Nov 8, 2021

Hi all,

I am getting the same error. All details that @SpaceView and @SamFC10 mentioned are almost the same for me. I did not train my own model. I'm just trying to run the existing model. And torch.load row (self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)) throws an error like "RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]".

By the way, I tried both 5.0 and 6.0 pretrained models.

@glenn-jocher
Member

glenn-jocher commented Nov 8, 2021

@yamand16 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.

How to create a Minimal, Reproducible Example

When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:

  • Minimal – Use as little code as possible to produce the problem
  • Complete – Provide all parts someone else needs to reproduce the problem
  • Reproducible – Test the code you're about to provide to make sure it reproduces the problem

For Ultralytics to provide assistance your code should also be:

  • Current – Verify that your code is up-to-date with GitHub master, and if necessary git pull or git clone a new copy to ensure your problem has not already been solved in master.
  • Unmodified – Your problem must be reproducible using official YOLOv5 code without changes. Ultralytics does not provide support for custom code ⚠️.

If you believe your problem meets all the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem.

Thank you! 😃

@github-actions
Contributor

github-actions bot commented Dec 9, 2021

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@atremblay-rayhawk

I wanted to chime in here that I ran into this issue as well. I waited until we had updated to the most recent code, hoping it would be resolved, but unfortunately it was not.

We've had to temporarily patch this call:

if self.onnx_dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)

to

if self.onnx_dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

and

if self.inplace:
    y[..., 0:2] = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
    xy = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

to

if self.inplace:
    y[..., 0:2] = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
else:  # for YOLOv5 on AWS Inferentia https://github.com/ultralytics/yolov5/pull/2953
    xy = (y[..., 0:2] * 2 - 0.5 + self.grid[i]) * self.stride[i]  # xy
    wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2)

and then revert the _make_grid function back to:

@staticmethod
def _make_grid(nx=20, ny=20):
    yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])
    return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

And everything works as expected. If not, we get the same error that has been listed before.
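
For reference, the xy and wh lines in the patch decode one prediction as shown below. This is a scalar sketch of the same arithmetic in plain Python; `decode_box` and the example numbers are hypothetical, and it assumes the inputs are already sigmoid outputs in [0, 1] and that the anchor is given in pixels.

```python
def decode_box(sx, sy, sw, sh, grid_x, grid_y, stride, anchor_w, anchor_h):
    """Decode one prediction using the same formulas as the xy / wh lines above."""
    x = (sx * 2 - 0.5 + grid_x) * stride  # box centre x in pixels
    y = (sy * 2 - 0.5 + grid_y) * stride  # box centre y in pixels
    w = (sw * 2) ** 2 * anchor_w          # box width in pixels
    h = (sh * 2) ** 2 * anchor_h          # box height in pixels
    return x, y, w, h


box = decode_box(0.5, 0.5, 0.5, 0.5, grid_x=2, grid_y=1, stride=8, anchor_w=10, anchor_h=13)
print(box)  # (20.0, 12.0, 10.0, 13.0)
```

Because the anchor only scales w and h, anchor_grid[i] needs one (w, h) pair per anchor that broadcasts over the grid, which is what the .view(1, self.na, 1, 1, 2) in the patch provides.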

@glenn-jocher
Member

glenn-jocher commented Dec 24, 2021

@atremblay-rayhawk hi, thank you for your fix suggestion on how to improve YOLOv5 🚀!

The fastest and easiest way to incorporate your ideas into the official codebase is to submit a Pull Request (PR) implementing your idea, and if applicable providing before and after profiling/inference/training results to help us understand the improvement your feature provides. This allows us to directly see the changes in the code and to understand how they affect workflows and performance.

Please see our ✅ Contributing Guide to get started.

@gg22mm

gg22mm commented Jan 8, 2022

This is probably because loading the checkpoint directly is no longer supported.

#1 Direct load:
model = torch.load('./weights/yolov5s.pt', map_location=device)['model'].float()  # load to FP32

#2 Loading through DetectMultiBackend works:
model = DetectMultiBackend('./weights/yolov5s.pt', device=device, dnn=False)  # this is OK!

But I prefer the first one. I don't want such complex encapsulation.

@glenn-jocher
Member

glenn-jocher commented Jan 8, 2022

@gg22mm YOLOv5 models can be loaded any way you want. Your problem is not reproducible:

Screen Shot 2022-01-08 at 11 40 06 AM

@deepxiaobai

I am getting the same error.

models\yolo.py line 59
self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]

@glenn-jocher
Member

glenn-jocher commented Feb 28, 2022

@deepxiaobai 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Please follow the minimal reproducible example guidelines posted earlier in this thread (Minimal, Complete, Reproducible, Current, Unmodified), then raise a new issue using the 🐛 Bug Report template so we can better understand and diagnose your problem.

Thank you! 😃

@ozett

ozett commented Mar 19, 2022

I cannot help with code or analysis,
but here is a model which gives me the same error in the doods2 environment.

Maybe someone needs such a model for further testing?

https://github.com/OlafenwaMoses/DeepStack_OpenLogo/releases/download/v1/openlogo.pt

@glenn-jocher
Member

glenn-jocher commented Mar 19, 2022

@ozett 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Please follow the minimal reproducible example guidelines posted earlier in this thread (Minimal, Complete, Reproducible, Current, Unmodified), then raise a new issue using the 🐛 Bug Report template so we can better understand and diagnose your problem.

Thank you! 😃

@billalkuet07

I trained a model with v5.0, saved it, and am trying to load it with v6.1. I am getting the following error:

  File "/workercode/./yolov5/models/common.py", line 439, in forward
    y = self.model(im, augment=augment, visualize=visualize)[0]
  File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workercode/./yolov5/models/yolo.py", line 137, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "/workercode/./yolov5/models/yolo.py", line 160, in _forward_once
    x = m(x)  # run
  File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workercode/./yolov5/models/yolo.py", line 65, in forward
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
RuntimeError: The expanded size of the tensor (1) must match the existing size (20) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 20, 20, 2]

Is there any suggestion that could help me?

@glenn-jocher
Member

Train a new model with the latest code.

@JAYANTH-MOHAN

Yeah, I got the same error, but I managed to fix it.
Here are the steps:
1.) Make sure you cloned the master branch.
2.) Take the model weights from the latest YOLOv5. Never use weights (.pt file) from previous YOLOv5 versions with the latest code; it gives the non-singleton dimension 3 error. This is how I fixed my error. All the best!

@glenn-jocher
Member

@JAYANTH-MOHAN thanks for sharing your solution! This will be helpful for others who encounter similar issues. If you have any other questions or need further assistance, feel free to ask. Good luck with your YOLOv5 project!
