After changing the dataset path, validation still searches for images in the old location #3349

GiorgioSgl · 2021-05-26T10:57:57Z

🐛 Bug

The bug is caused by the validation split of the dataset. I'm working with Google's Open Images dataset. The problem is simple: I changed the OS I train on and moved to Windows, so all the directories changed and I can't put the dataset in the same location. What I did was edit data.yaml and change the dataset directory. Training works, but validation does not.

In the first epoch, training on the train split runs without any problem, but when validation starts it looks for images in the old location and raises an error saying that it can't find the first image of the validation set.

To Reproduce

Just run train.py after changing the location of the dataset and updating the data.yaml file to match.

Output

Traceback (most recent call last): 
  File "train.py", line 543, in <module> 
    train(hyp, opt, device, tb_writer) 
  File "train.py", line 354, in train 
    results, maps, times = test.test(data_dict, 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context 
    return func(*args, **kwargs) 
  File "C:\Users\gofor\Desktop\yolov5-master\test.py", line 102, in test 
    for batch_i, (img, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)): 
  File "C:\Users\gofor\anaconda3\lib\site-packages\tqdm\std.py", line 1165, in __iter__
    for obj in iterable: 
  File "C:\Users\gofor\Desktop\yolov5-master\utils\datasets.py", line 104, in __iter__
    yield next(self.iterator) 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data() 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data 
    return self._process_data(data) 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data 
    data.reraise() 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\_utils.py", line 429, in reraise 
    raise self.exc_type(msg) 
AssertionError: Caught AssertionError in DataLoader worker process 0. 
Original Traceback (most recent call last): 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop 
    data = fetcher.fetch(index) 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch 
    data = [self.dataset[idx] for idx in possibly_batched_index] 
  File "C:\Users\gofor\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp> 
    data = [self.dataset[idx] for idx in possibly_batched_index] 
  File "C:\Users\gofor\Desktop\yolov5-master\utils\datasets.py", line 540, in __getitem__
    img, (h0, w0), (h, w) = load_image(self, index) 
  File "C:\Users\gofor\Desktop\yolov5-master\utils\datasets.py", line 638, in load_image 
    assert img is not None, 'Image Not Found ' + path 
AssertionError: Image Not Found /home/goforespain/Dataset/images/valid/d0776b45a6256287.jpg

Expected behavior

Training and validation should both use the new location defined in data.yaml, not the old one.

Environment

If applicable, add screenshots to help explain your problem.

OS: Windows
GPU: GeForce GTX 1070
CUDA: 11.1

GiorgioSgl · 2021-05-26T11:10:41Z

I just noticed that the valid.cache file was not reinitialized when I changed the location of the dataset, while the train one was. So I'll just remove it and see if something changes.
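The manual workaround described above (deleting the stale cache so it gets rebuilt) could be scripted roughly as follows. This is only a sketch: `remove_label_caches` is a hypothetical helper, not part of YOLOv5, and it assumes the cache files end in `.cache` and live somewhere below the dataset root.

```python
from pathlib import Path


def remove_label_caches(dataset_root):
    """Delete every *.cache file below dataset_root and return the removed paths.

    Hypothetical helper: the '.cache' suffix and its location under the
    dataset root are assumptions based on the 'valid.cache' file named above.
    """
    removed = []
    # Materialize the matches first so files are not deleted while iterating.
    for cache_file in sorted(Path(dataset_root).rglob("*.cache")):
        if cache_file.is_file():
            cache_file.unlink()
            removed.append(cache_file)
    return removed
```

Calling it with the new dataset directory before running train.py should leave images and labels untouched and remove only the cache files.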

glenn-jocher · 2021-05-26T11:17:36Z

@GiorgioSgl thanks for the bug report! Yes, this is happening because the cache file saved the older directories. I should update the cache hash to recognize changes in dataset location in addition to dataset contents.
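As an illustration of the idea in this comment, a hash that reacts to both dataset contents and dataset location might look like the sketch below. It is not the code from the repository: `dataset_hash` is a hypothetical name, total file size stands in for "contents", and SHA-256 is an arbitrary choice of digest.

```python
import hashlib
import os


def dataset_hash(paths):
    """Return one hex digest that depends on total file size and on the path strings.

    Hypothetical sketch: file size is used as a cheap proxy for contents,
    and the path strings make the digest sensitive to dataset location.
    """
    # Sum the sizes of the files that exist; missing files contribute nothing.
    total_size = sum(os.path.getsize(p) for p in paths if os.path.exists(p))
    digest = hashlib.sha256(str(total_size).encode())
    # Mixing in the paths means a moved dataset yields a different digest.
    digest.update("\n".join(paths).encode())
    return digest.hexdigest()
```

With a hash like this, the same images under `/home/...` and under `C:\Users\...` produce different digests, so a cache written for one location would not be mistaken for valid at the other.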

Possible fix for #3349

GiorgioSgl · 2021-05-26T11:35:49Z

> @GiorgioSgl thanks for the bug report! Yes, this is happening because the cache file saved the older directories. I should update the cache hash to recognize changes in dataset location in addition to dataset contents.

Yeah that will be amazing! Thanks for the fast answer.

* Update cache v0.2 to include parent hash
  Possible fix for #3349
* Update datasets.py

glenn-jocher · 2021-05-26T12:28:29Z

@GiorgioSgl good news 😃! Your original issue may now be fixed ✅ in PR #3350. This PR implements a new hashlib-based solution for detecting changes to dataset contents or location, recaching as necessary when either is detected. This new system will force all YOLOv5 users to recache their existing datasets once, but this should occur automatically one time only and is not a breaking change. To receive this update:

Git – git pull from within your yolov5/ directory or git clone https://github.com/ultralytics/yolov5 again
PyTorch Hub – Force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
Notebooks – View updated notebooks
Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
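The "recache as necessary" behaviour described in this comment can be sketched as a load-or-rebuild step. The snippet below is an assumption-laden illustration, not the PR #3350 implementation: `fingerprint`, `load_or_rebuild`, `build_fn`, and the JSON cache layout are all hypothetical, and the fingerprint covers only path strings to keep the example short.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(paths):
    """Hash the path strings only (simplified; a real check would also cover contents)."""
    return hashlib.sha256("\n".join(paths).encode()).hexdigest()


def load_or_rebuild(cache_path, image_paths, build_fn):
    """Return (data, rebuilt). Reuse the cache only if its stored hash still matches."""
    cache_path = Path(cache_path)
    current = fingerprint(image_paths)
    if cache_path.is_file():
        with open(cache_path, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if cached.get("hash") == current:
            return cached["data"], False  # cache is still valid
    # Hash mismatch or no cache: rebuild and store the new hash next to the data.
    data = build_fn(image_paths)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"hash": current, "data": data}, f)
    return data, True
```

Here `build_fn` stands for whatever expensive scan produces the cached data; it must return something JSON-serializable in this sketch. Moving the dataset changes `image_paths`, so the stored hash no longer matches and the cache is rebuilt once.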

GiorgioSgl · 2021-05-26T14:51:53Z

Thanks, man! You're truly the best!

GiorgioSgl added the "bug" (Something isn't working) label May 26, 2021

glenn-jocher added a commit that referenced this issue May 26, 2021

Update cache v0.2 to include parent hash

2a84878

Possible fix for #3349

glenn-jocher mentioned this issue May 26, 2021

Updated cache v0.2 with hashlib #3350

Merged

glenn-jocher linked a pull request May 26, 2021 that will close this issue

Updated cache v0.2 with hashlib #3350

Merged

glenn-jocher closed this as completed in #3350 May 26, 2021

glenn-jocher added a commit that referenced this issue May 26, 2021

Updated cache v0.2 with hashlib (#3350)

c6b5bfc

* Update cache v0.2 to include parent hash
  Possible fix for #3349
* Update datasets.py

Lechtr pushed a commit to Lechtr/yolov5 that referenced this issue Jul 20, 2021

Updated cache v0.2 with hashlib (ultralytics#3350)

0ec1b8a

* Update cache v0.2 to include parent hash
  Possible fix for ultralytics#3349
* Update datasets.py
(cherry picked from commit c6b5bfc)

BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this issue Aug 26, 2022

Updated cache v0.2 with hashlib (ultralytics#3350)

6847e1a

* Update cache v0.2 to include parent hash
  Possible fix for ultralytics#3349
* Update datasets.py

SecretStar112 added a commit to SecretStar112/yolov5 that referenced this issue May 24, 2023

Updated cache v0.2 with hashlib (#3350)

ad2880c

* Update cache v0.2 to include parent hash
  Possible fix for ultralytics/yolov5#3349
* Update datasets.py
