How to train in a custom dataset? #109

Closed
opassos opened this issue Feb 21, 2022 · 13 comments


@opassos

opassos commented Feb 21, 2022

I am having a bit of trouble training on a custom dataset. I reproduced the MVTec folder structure, but I still get errors like:

IndexError: too many indices for tensor of dimension 0

@opassos closed this as completed Feb 21, 2022
@cvramanan

How did you solve this issue? I'm facing the same issue while training with a custom dataset.

@samet-akcay
Contributor

Hi @cvramanan, can you elaborate on what sort of issues you are having with a custom dataset?

@birdortyedi

birdortyedi commented Mar 1, 2022

I had the exact same issue. Is there any solution?

@samet-akcay
Contributor

@birdortyedi, would you be able to share the full error log from your terminal so that we can reproduce it?

@birdortyedi

I have found that if your test data is not in the correct format, the adaptive threshold step raises the same error. Fixing the test data solved it for me. The error originates here:

anomalib/utils/metrics/adaptive_threshold.py line 40
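
For context, an adaptive threshold is commonly chosen as the anomaly score that maximizes F1 on the test set. The sketch below illustrates that idea in plain Python; it is not anomalib's implementation, and the function name `best_f1_threshold` is hypothetical. It shows why a test set that ends up with only one class (for example, because anomalous images or masks were not picked up) leaves no meaningful threshold to select, which is one plausible way malformed test data could surface as an indexing error.

```python
from typing import Sequence


def best_f1_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the candidate threshold that maximizes F1 (label 1 = anomalous).

    Illustrative only: a plain-Python stand-in, not the library's code.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(set(labels)) < 2:
        # With a single class in the test set there is nothing to separate.
        raise ValueError("test set needs both normal (0) and anomalous (1) samples")

    best_threshold, best_f1 = 0.0, -1.0
    for threshold in sorted(set(scores)):
        # A sample is predicted anomalous when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold
```

Running a check like the single-class guard above on your own labels before training is a cheap way to rule out this cause.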

@archg2021

@birdortyedi, can you please share exactly what you needed to change to get the custom data into the "correct format"? I'm facing the same error and am unable to find what needs to be fixed.

@birdortyedi

@archg2021 I changed the folder structure of my dataset to match theirs exactly, e.g. folder names like "broken_large", "good", etc.

@archg2021

archg2021 commented Mar 14, 2022

@birdortyedi It still does not seem to work. My directory structure looks like this:

datasets/MVTec/dummy
├── ground_truth
│   ├── broken_large
│   └── broken_small
├── test
│   ├── broken_large
│   ├── broken_small
│   └── good
└── train
    └── good

@samet-akcay
Contributor

@archg2021 What error are you getting? The same as above?

IndexError: too many indices for tensor of dimension 0

Also, can you show the image filenames, especially those of the ground truth? Ground-truth filenames must end with ..._mask.png

We're working on a CustomDataset class that handles custom datasets. We aim to merge it this week.
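
The naming rule above can be checked before training. The following is a minimal sketch using only the standard library; the helper name `find_unpaired` is hypothetical, and it assumes the MVTec convention that a test image `test/<category>/<name>.png` has its mask at `ground_truth/<category>/<name>_mask.png`, with the `good` folder holding normal images that need no masks.

```python
from pathlib import Path
from typing import List, Tuple


def find_unpaired(root: Path, normal_dir: str = "good") -> Tuple[List[Path], List[Path]]:
    """Return (test images without a mask, masks without a test image).

    Assumes an MVTec-style layout under `root` with `test/` and `ground_truth/`.
    """
    test_dir, gt_dir = root / "test", root / "ground_truth"
    missing_masks: List[Path] = []
    orphan_masks: List[Path] = []

    for category in sorted(p for p in test_dir.iterdir() if p.is_dir()):
        if category.name == normal_dir:
            continue  # normal images are not expected to have masks
        for image in sorted(category.glob("*.png")):
            mask = gt_dir / category.name / f"{image.stem}_mask.png"
            if not mask.is_file():
                missing_masks.append(image)

    for mask in sorted(gt_dir.glob("*/*_mask.png")):
        # Strip the "_mask" suffix to recover the expected image name.
        image_name = mask.stem[: -len("_mask")] + ".png"
        if not (test_dir / mask.parent.name / image_name).is_file():
            orphan_masks.append(mask)

    return missing_masks, orphan_masks
```

For example, `missing, orphans = find_unpaired(Path("datasets/MVTec/dummy"))` lists the files whose counterpart is absent; both lists should be empty for a consistent dataset.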

@archg2021

@samet-akcay First of all, thanks a lot for the amazing work. Yes, I get the same error.
The image filenames are in date-time format, and the ground-truth masks carry the _mask.png suffix. The dataset folder structure looks as follows:

datasets/MVTec/dummy
├── ground_truth
│   ├── broken_large
│   │   ├── 20211124-1251-31_mask.png
│   │   ├── .
│   │   ├── .
│   │   └── 20211124-1625-21_mask.png
│   └── broken_small
│       ├── 20211125-0805-40_mask.png
│       ├── .
│       ├── .
│       └── 20211125-1354-40_mask.png
├── test
│   ├── broken_large
│   │   ├── 20211124-1251-31.png
│   │   ├── .
│   │   ├── .
│   │   └── 20211124-1625-21.png
│   ├── broken_small
│   │   ├── 20211125-0806-34.png
│   │   ├── 20211125-0807-13.png
│   │   ├── .
│   │   ├── .
│   │   └── 20211125-1327-39.png
│   ├── good
│   │   ├── 20211124-1240-08.png
│   │   ├── .
│   │   ├── .
│   │   └── 20211124-1541-05.png
│   └── jpg2png.py
└── train
    └── good
        ├── 20211124-1241-05.png
        ├── .
        ├── .
        └── 20211124-1541-19.png

@archg2021

archg2021 commented Mar 16, 2022

@samet-akcay Can you please help me fix this error or provide some suggestions?

@samet-akcay
Contributor

samet-akcay commented Mar 16, 2022

@archg2021, unfortunately I couldn't reproduce it here. What I would do is debug it here. Samples should follow the format shown in the example.

We aim to add the custom dataset support by the end of the week. You could also try that once it's merged.

@samet-akcay
Contributor

Hi @archg2021, PR #154 adds custom dataset support. If you want to try it before it's merged, you can test it from its feature branch.
