Yolov5-6.0 Specific Bug: The expanded size of the tensor (1) must match the existing size (4) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 4, 4, 2] #5234
Comments
👋 Hello @SpaceView, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Requirements
Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit. |
@SpaceView thanks for the bug report. This might just be due to out-of-date code or models. I tested this locally in PyCharm on macOS with Python 3.9 and everything seems fine: The CI tests also regularly run YOLOv5n with all main functions (train, val, detect, export) on Windows, and they are currently green: |
@fcakyon @SpaceView I'm not able to reproduce any error here. The following two examples execute correctly in Colab.
!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5n.pt
!python detect.py --weights runs/train/exp/weights/best.pt
!python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights '' --cfg yolov5n.yaml
!python detect.py --weights runs/train/exp2/weights/best.pt
Response from detect.py calls is:
detect: weights=['runs/train/exp/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)
Fusing layers...
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 4 persons, 1 bus, 1 skateboard, Done. (0.015s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 2 persons, 1 tie, Done. (0.016s)
Speed: 0.4ms pre-process, 15.3ms inference, 1.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp
detect: weights=['runs/train/exp2/weights/best.pt'], source=data/images, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.0-23-ga18b0c3 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16280.875MB)
Fusing layers...
Model Summary: 213 layers, 1867405 parameters, 0 gradients, 4.5 GFLOPs
image 1/2 /content/yolov5/data/images/bus.jpg: 640x480 Done. (0.016s)
image 2/2 /content/yolov5/data/images/zidane.jpg: 384x640 Done. (0.017s)
Speed: 0.4ms pre-process, 16.4ms inference, 0.4ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2
We've created a few short guidelines below to help users provide what we need in order to get started investigating a possible problem.
How to create a Minimal, Reproducible Example
When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to reproduce the problem. This is referred to by community members as creating a minimum reproducible example. Your code that reproduces the problem should be:
In addition to the above requirements, for Ultralytics to provide assistance your code should be:
If you believe your problem meets all of the above criteria, please close this issue and raise a new one using the 🐛 Bug Report template and providing a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃 |
I also trained a new model from a custom trained model (exp2/weights/best.pt), and detecting again with the new exp3/weights/best.pt, everything worked correctly:
|
Hi @SpaceView @fcakyon, yes, the bug originates from my PR. I have tried to reproduce the error with pre-trained and custom-trained yolov5n from scratch (similar code to @glenn-jocher's), but detect.py works correctly with both models.
self.anchor_grid is supposed to be a list of Tensors, but from the error message it looks like self.anchor_grid is a single Tensor (as it was before my PR was merged), and assigning a Tensor of a different shape to one of its slices raises this error. The conversion from Tensor to list of Tensors is done in attempt_load() (lines 106 to 108 in a18b0c3),
and I see that this function is being called during runtime.
Compatibility with models trained before my PR was checked before merging it, so it's quite strange to see this bug. As suggested by Glenn, some more reproducer code/models are needed. |
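The Tensor-versus-list explanation above can be illustrated with a small sketch. It is not the YOLOv5 source. It only assumes 3 detection layers with 3 anchors each, and a 4x4 feature map as in the issue title. The variable names are made up for illustration.

```python
import torch

nl, na = 3, 3  # assumed: 3 detection layers, 3 anchors per layer

# Old-style layout: one tensor holding the anchor grids of all layers.
anchor_grid_tensor = torch.zeros(nl, 1, na, 1, 1, 2)

# A freshly built grid for a 4x4 feature map, as in the report.
new_grid = torch.zeros(1, na, 4, 4, 2)

try:
    # Indexing yields a (1, 3, 1, 1, 2) slot; the copy cannot fit a
    # (1, 3, 4, 4, 2) tensor into it, so PyTorch raises a RuntimeError.
    anchor_grid_tensor[0] = new_grid
    tensor_assignment_failed = False
except RuntimeError as err:
    tensor_assignment_failed = True
    print(err)

# New-style layout: a Python list, so each slot can be rebound to a
# tensor of any shape without copying into existing storage.
anchor_grid_list = [torch.zeros(1)] * nl
anchor_grid_list[0] = new_grid
print(anchor_grid_list[0].shape)
```

With the list layout the assignment simply rebinds the slot, which is why a checkpoint whose anchor_grid is still a single Tensor would behave differently from one converted to a list.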
@glenn-jocher @SamFC10 the error is raised when a model trained with the 5.0 source is used with detect.py from the 6.0 source. The compatibility addition does not seem to be working for some reason. |
@fcakyon Please add a link to your trained model if possible. Some edge case is being missed. |
I cannot add it for privacy reasons; I will try to train a redundant model for reproducibility. |
@glenn-jocher @fcakyon @SamFC10
As you can see, in model_info I added two print statements for debugging. If thop.profile works correctly, both lines should be printed. My output log is given below; only the first debug line is shown and the second is not, which means thop.profile is cut short by an internal error, so the lines that follow it are never executed.
It is easy to check this as I did. I will look further into this problem, from training to evaluation, in the next couple of days if I have time. By the way, I used the yolov5-6.0 model and the 5.0 model from your release archive; they give the same results. |
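The symptom described above (first debug line printed, second missing, no traceback) is what a broad try/except around a failing call produces. The sketch below is a generic illustration, not the YOLOv5 source: profile_flops is a hypothetical stand-in for the profiling call, and it is only an assumption that the real call sits inside such a handler.

```python
import traceback


def profile_flops(model):
    """Hypothetical stand-in for a profiling call that fails internally."""
    raise RuntimeError("shape mismatch inside forward pass")


def model_info_silent(model):
    """Swallows the failure: the second debug line never appears."""
    print("debug: before profiling")
    try:
        flops = profile_flops(model)
        print("debug: after profiling")  # skipped when profiling raises
    except Exception:
        flops = None  # error hidden, no traceback shown
    return flops


def model_info_verbose(model):
    """Same flow, but keeps the traceback so the cause is visible."""
    try:
        return profile_flops(model), None
    except Exception:
        return None, traceback.format_exc()


silent_result = model_info_silent(model=None)
verbose_result, captured = model_info_verbose(model=None)
print(captured)
```

Temporarily capturing or re-raising the exception in this way is usually quicker than print-based bisection when a call appears to stop without any error.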
I may have found the reason: the error has something to do with PyTorch's tensor expansion mechanism (dimension matching). @fcakyon is right.
I used the latest code and ran a short training; the error disappeared when using my own trained weights. If I use a downloaded model (e.g. Yolov5n.pt), the error pops up. |
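The dimension-matching rule mentioned above can be sketched in plain Python. This is a simplified model of the expansion rule, not PyTorch's implementation: shapes are aligned from the last dimension, and a source dimension fits only if it equals the target dimension or is 1. The function name is made up for illustration.

```python
def first_expand_mismatch(source, target):
    """Return the target dimension index where expansion fails, scanning
    from the last dimension backwards, or None if source fits target."""
    if len(source) > len(target):
        raise ValueError("source has more dimensions than target")
    offset = len(target) - len(source)  # align trailing dimensions
    for i in range(len(source) - 1, -1, -1):
        s, t = source[i], target[i + offset]
        if s != t and s != 1:
            return i + offset
    return None


# Shapes taken from the error message in the issue title.
print(first_expand_mismatch((3, 4, 4, 2), (1, 3, 1, 1, 2)))

# The opposite direction is fine: size-1 dimensions can be expanded.
print(first_expand_mismatch((1, 3, 1, 1, 2), (1, 3, 4, 4, 2)))
```

For the shapes in the title, the first failing position is dimension 3, which agrees with the "non-singleton dimension 3" wording of the error.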
@SpaceView As I've mentioned above, please add links to your trained model if possible, so that the error can be reproduced from my side and debugged. |
I meet this problem when I try the simple example in https://docs.ultralytics.com/tutorials/pytorch-hub/. |
@RaZzzyz Cannot reproduce the bug using the |
@SamFC10
To reproduce the issue, please read my second-to-last comment: you will not be able to print both of those lines if you use an old trained model, even though no exception is raised. I suppose this issue can be closed. If you train the model using the latest code, there will be no problem. |
Hi all, I am getting the same error. All details that @SpaceView and @SamFC10 mentioned are almost the same for me. I did not train my own model. I'm just trying to run the existing model. And torch.load row (self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)) throws an error like "RuntimeError: The expanded size of the tensor (1) must match the existing size (80) at non-singleton dimension 3. Target sizes: [1, 3, 1, 1, 2]. Tensor sizes: [3, 48, 80, 2]". By the way, I tried both 5.0 and 6.0 pretrained models. |
@yamand16 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Please see the Minimal, Reproducible Example guidelines earlier in this thread. If you believe your problem meets those criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃 |
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed! Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐! |
I wanted to chime in here to say that I ran into this issue as well. I waited until we had updated to the most recent code, hoping it would be resolved, but unfortunately it was not. We've had to temporarily patch this call:
to
and
to
and then revert the
And everything works as expected. If not, we get the same error that has been listed before. |
@atremblay-rayhawk hi, thank you for your fix suggestion on how to improve YOLOv5 🚀! The fastest and easiest way to incorporate your ideas into the official codebase is to submit a Pull Request (PR) implementing your idea, and if applicable providing before and after profiling/inference/training results to help us understand the improvement your feature provides. This allows us to directly see the changes in the code and to understand how they affect workflows and performance. Please see our ✅ Contributing Guide to get started. |
This should be because it is not supported now.
#1
model = torch.load('./weights/yolov5s.pt', map_location=device)['model'].float()  # load to FP32
#2
That's all. But I prefer the first one; I don't want to do such complex encapsulation. |
@gg22mm YOLOv5 models can be loaded any way you want. Your problem is not reproducible. Please see the Minimal, Reproducible Example guidelines earlier in this thread. If you believe your problem meets those criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃 |
I am getting the same error. models\yolo.py line 59 |
@deepxiaobai 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Please see the Minimal, Reproducible Example guidelines earlier in this thread. If you believe your problem meets those criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃 |
I cannot help with code or analysis, but maybe someone needs a model like this for further testing? https://github.com/OlafenwaMoses/DeepStack_OpenLogo/releases/download/v1/openlogo.pt |
@ozett 👋 hi, thanks for letting us know about this possible problem with YOLOv5 🚀. Please see the Minimal, Reproducible Example guidelines earlier in this thread. If you believe your problem meets those criteria, please close this issue and raise a new one using the 🐛 Bug Report template with a minimum reproducible example to help us better understand and diagnose your problem. Thank you! 😃 |
I trained a model with v5.0, saved it, and am trying to load it with v6.1. I am getting the following error: File "/workercode/./yolov5/models/common.py", line 439, in forward. Is there any suggestion that could help me? |
Train a new model with the latest code. |
Yeah, I got the same error. However, I corrected it |
@JAYANTH-MOHAN thanks for sharing your solution! This will be helpful for others who encounter similar issues. If you have any other questions or need further assistance, feel free to ask. Good luck with your YOLOv5 project! |
This is a bug specific to Yolov5-6.0; Yolov5-5.0 doesn't have this problem.
How to reproduce the bug:
The error info is given below:
It seems that the following item has a problem:
I used the following equivalent code to debug it
and found that when i==0:
self.anchor_grid[0].shape --> torch.Size([1, 3, 1, 1, 2])
tmp_anchor_grid.shape --> torch.Size([1, 3, 4, 4, 2])
The problem seems to come from thop.profile.
Currently I have no idea how this comes about; where is self.anchor_grid[0] coming from?
When I run the script in the Windows PowerShell console, I get no such bug, as shown below: