I got a RuntimeError when I run classifier.py to train on my own dataset. #2438

Closed
UNeedCryDear opened this issue Mar 12, 2021 · 10 comments

@UNeedCryDear

It's me again.
First, Python reports an error on the "fill" argument. It should be fillcolor rather than fill, as in the old classifier.py.

shear=(-1, 1, -1, 1), fill=(114, 114, 114)),

Second, after I changed fill to fillcolor, MNIST runs fine, but as soon as I train on my own data I get an error. The error message is as follows:
`epoch gpu_mem train_loss val_loss accuracy
1/50 0.124G 3.23 : 97%|█████████▋| 34/35 [00:19<00:00, 1.72it/s]
Traceback (most recent call last):
  File "D:/PyCharm/yolov5-classifier/classifier.py", line 240, in <module>
    train()
  File "D:/PyCharm/yolov5-classifier/classifier.py", line 145, in train
    fitness = test(model, testloader, names, criterion, pbar=pbar) # test
  File "D:/PyCharm/yolov5-classifier/classifier.py", line 184, in test
    for images, labels in dataloader:
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 83, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 83, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "C:\anaconda3\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 90, 59] at entry 0 and [3, 37, 25] at entry 1

`
How can I solve this problem?

@glenn-jocher
Member

@UNeedCryDear PyTorch 1.8.0 uses the new 'fill' argument, and the code was recently updated to the 1.8.0 standard. Correct usage of the file is shown in the comments at the top; you can use this to get started:

yolov5/classifier.py

Lines 1 to 2 in 058d667

# YOLOv5 classifier training
# Usage: python classifier.py --model yolov5s --data mnist --epochs 10 --img 128
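
For anyone who needs the same script to run on both older and newer installs, one option is to pick the keyword at runtime. The following is a minimal sketch, not code from classifier.py: `fill_kwarg` and `build_affine` are hypothetical helper names, and it assumes the only relevant difference between versions is whether the `RandomAffine` constructor accepts `fill` or the older `fillcolor`.

```python
import inspect


def fill_kwarg(transform_cls, value=(114, 114, 114)):
    """Return {'fill': value} or {'fillcolor': value}, whichever the class accepts.

    Hypothetical helper: it only inspects the constructor signature.
    """
    params = inspect.signature(transform_cls.__init__).parameters
    if "fill" in params:
        return {"fill": value}
    if "fillcolor" in params:
        return {"fillcolor": value}
    return {}  # neither keyword exists; fall back to the library default


def build_affine():
    """Build the RandomAffine op quoted in this thread with the accepted keyword."""
    import torchvision.transforms as T  # imported lazily; requires torchvision

    return T.RandomAffine(degrees=1, translate=(.2, .2), scale=(1 / 1.5, 1.5),
                          shear=(-1, 1, -1, 1), **fill_kwarg(T.RandomAffine))
```

Checking for `fill` first means that a version accepting both names gets the newer one.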

@UNeedCryDear
Author

@glenn-jocher MNIST runs fine with PyTorch 1.7.1, but as soon as I train my classifier on my own dataset, it reports a RuntimeError:
RuntimeError: stack expects each tensor to be equal size, but got [3, 90, 59] at entry 0 and [3, 37, 25] at entry 1

@glenn-jocher
Member

@UNeedCryDear if everything works with MNIST, CIFAR, etc, and then it stops working on your custom dataset, the culprit would be your dataset. I'm not sure there's anything I can do about your dataset other than to recommend you use the existing datasets as templates to get started with your custom dataset.
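
One way to confirm that the dataset is the culprit is to count the image shapes it yields before training. This is a minimal sketch under stated assumptions: `count_sample_shapes` is a hypothetical helper, and the dataset is assumed to be an iterable of (image, label) pairs whose images expose a `.shape` attribute, as tensors do.

```python
from collections import Counter


def count_sample_shapes(dataset):
    """Count how many samples have each image shape.

    `dataset` is any iterable of (image, label) pairs whose images
    expose a `.shape` attribute (an assumption of this sketch).
    """
    return Counter(tuple(image.shape) for image, _ in dataset)


def report_shapes(dataset):
    """Print the shape counts and flag a dataset that cannot be batched as-is."""
    shapes = count_sample_shapes(dataset)
    if len(shapes) > 1:
        print("Mixed shapes found; default batching stacks tensors and will fail:")
    for shape, n in shapes.most_common():
        print(f"  {shape}: {n} samples")
    return shapes
```

If more than one shape is reported, the transforms need a resize (or crop) step so that every sample leaves the dataset with the same shape.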

@CE-Noob

CE-Noob commented Mar 16, 2021

@glenn-jocher Thank you for your work. I ran into the same problem. Training runs on MNIST, but when I changed the size of one image in MNIST, it reported an error:

RuntimeError: stack expects each tensor to be equal size, but got [3, 28, 28] at entry 0 and [3, 522, 523]

28x28 is the MNIST image size; 522x523 is the size of the image I changed.
Even though I've used T.Resize((imgsz, imgsz)), it doesn't seem to resize the image.

# T.Resize([imgsz, imgsz]), # very slow

@glenn-jocher
Member

glenn-jocher commented Mar 18, 2021

@CE-Noob @UNeedCryDear if you are passing in differently shaped images for classifier training, you'll need to update both the train and test loaders so that they output identically shaped images; otherwise the batching operation will fail:

yolov5/classifier.py

Lines 54 to 63 in 058d667

# Transforms
trainform = T.Compose([T.RandomGrayscale(p=0.01),
                       T.RandomHorizontalFlip(p=0.5),
                       T.RandomAffine(degrees=1, translate=(.2, .2), scale=(1 / 1.5, 1.5),
                                      shear=(-1, 1, -1, 1), fill=(114, 114, 114)),
                       # T.Resize([imgsz, imgsz]), # very slow
                       T.ToTensor(),
                       T.Normalize((0.5, 0.5, 0.5), (0.25, 0.25, 0.25))]) # PILImage from [0, 1] to [-1, 1]
testform = T.Compose(trainform.transforms[-2:])

EDIT: remember that the current testform is a slice of trainform containing only its last two ops, so you'll need to include the resize op in the test transforms as well.
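
Because the slice index has to change whenever an op is added, an alternative is to derive the test pipeline by dropping the random augmentations instead of counting from the end. This is a minimal sketch, not code from classifier.py: `eval_transforms` and `build_transforms` are hypothetical names, and the filter relies on the assumption that every random augmentation class name starts with `Random`, as the torchvision ops in the snippet do.

```python
def eval_transforms(train_ops):
    """Keep every op whose class name does not start with 'Random'.

    Heuristic of this sketch: random augmentations are assumed to
    follow the 'Random*' naming used by the ops in this thread.
    """
    return [op for op in train_ops if not type(op).__name__.startswith("Random")]


def build_transforms(imgsz):
    """Return (trainform, testform); both resize to imgsz x imgsz."""
    import torchvision.transforms as T  # imported lazily; requires torchvision

    train_ops = [T.RandomGrayscale(p=0.01),
                 T.RandomHorizontalFlip(p=0.5),
                 # fill/fillcolor is left out here because the keyword
                 # depends on the installed torchvision version
                 T.RandomAffine(degrees=1, translate=(.2, .2), scale=(1 / 1.5, 1.5),
                                shear=(-1, 1, -1, 1)),
                 T.Resize([imgsz, imgsz]),  # every sample leaves with one shape
                 T.ToTensor(),
                 T.Normalize((0.5, 0.5, 0.5), (0.25, 0.25, 0.25))]
    return T.Compose(train_ops), T.Compose(eval_transforms(train_ops))
```

With this layout the test pipeline keeps Resize, ToTensor and Normalize, so train and test batches share the same image shape without a hand-maintained slice.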

@UNeedCryDear
Author

@glenn-jocher Thanks.

@mmez-11

mmez-11 commented Mar 30, 2021

@UNeedCryDear Hi, I also ran into the fill problem and a data directory problem. Could you guide me on how to get classifier weights? Thanks!

@UNeedCryDear
Author

UNeedCryDear commented Mar 30, 2021

@mmez-11
Modify the code like this:
'''
trainform = T.Compose([T.RandomGrayscale(p=0.01),
                       T.RandomHorizontalFlip(p=0.5),
                       T.RandomAffine(degrees=1, translate=(.2, .2), scale=(1 / 1.5, 1.5),
                                      shear=(-1, 1, -1, 1), fill=(114, 114, 114)),
                       T.Resize([imgsz, imgsz]), # use resize
                       T.ToTensor(),
                       T.Normalize((0.5, 0.5, 0.5), (0.25, 0.25, 0.25))]) # PILImage from [0, 1] to [-1, 1]
testform = T.Compose(trainform.transforms[-3:]) # changed [-2:] to [-3:]
'''

I can't get the formatting right, but you should be able to see how to modify it.

@mmez-11

mmez-11 commented Mar 30, 2021

@UNeedCryDear Thanks for your kindness, but I still have a problem. I got the error below:
Traceback (most recent call last):
  File "classifier.py", line 247, in <module>
    train()
  File "classifier.py", line 58, in train
    shear=(-1, 1, -1, 1), fill=(114, 114, 114)),
TypeError: __init__() got an unexpected keyword argument 'fill'

@UNeedCryDear
Author

@mmez-11 #2438 (comment)
Look at the reply above; I've already asked this question. With PyTorch 1.7, use fillcolor instead of fill.
