Issues with heatmap scaling when training without masks #173
What's your folder structure? In MVTec, it's:

```
my_dataset                  # root folder
  bottle                    # object type
    train                   # samples for training
      good                  # contains only good samples for training
    test                    # samples for testing
      good                  # contains only good samples for testing
      defected_type_1       # defective images
      defected_type_2       # defective images
      ...
    ground_truth            # contains the binary masks
      defected_type_1_mask  # binary masks
      defected_type_2_mask  # binary masks
      ...
```
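For reference, a layout like the one above can be scaffolded in one go (the dataset root and defect-type names here are placeholders; substitute your own):

```shell
# Create an MVTec-style directory layout for one object class.
root=my_dataset/bottle
mkdir -p "$root"/train/good
mkdir -p "$root"/test/good "$root"/test/defected_type_1 "$root"/test/defected_type_2
mkdir -p "$root"/ground_truth/defected_type_1_mask "$root"/ground_truth/defected_type_2_mask
```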
Thanks for your reply! To rule out the folder dataset as the error source, I reproduced the same issue using the MVTec dataset and the default model configs. Once I train on the provided ground-truth masks, and once on zero masks. Again, when training on the zero masks, the heatmap is scaled differently. Shouldn't the output heatmap be independent of the ground-truth mask? From my understanding, it should only be used to compute the evaluation scores, correct? My zero masks are simply created with
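For illustration, one minimal way to generate an all-zero mask is with NumPy and Pillow (a sketch only — the file name and the 224×224 size are placeholders, not the exact script used above):

```python
import numpy as np
from PIL import Image


def write_zero_mask(width: int, height: int, out_path: str) -> None:
    """Save a single-channel, all-black PNG: every pixel 0 = 'normal'."""
    mask = np.zeros((height, width), dtype=np.uint8)
    Image.fromarray(mask, mode="L").save(out_path)


# One zero mask per test image, matching the image resolution:
write_zero_mask(224, 224, "000_mask.png")
```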
relevant issue.
I'm wondering how zero masks can play a role here. I'm trying to understand the workflow of the anomaly-detection task. If I understand correctly, we only need good samples for training, and at test time there are good and bad samples with ground-truth masks, like in the MVTec dataset. In some cases, the program will adaptively pick a threshold. My custom dataset structure is as follows:

```
bottle                    # object type
  train                   # samples for training
    good                  # contains only good samples for training
  test                    # samples for testing
    good                  # contains only good samples for testing
    defected_type_1       # defective images
  ground_truth            # contains the binary masks
    defected_type_1_mask  # binary masks
```

I'm not sure how to change the config for this. For now, I changed it as follows, pointing at the test set only, whereas it should be the train set:

```yaml
dataset:
  name: mvtec # options: [mvtec, btech, folder]
  format: folder # mvtec
  path: C:/Users/..../bottle/test # ./datasets/MVTec
  normal: good
  abnormal: defected_type_1
  split_ratio: 0.2
  seed: 0
  task: segmentation
  mask: C:/Users/..../bottle/ground_truth/defected_type_1
  extensions: null
  # category: bottle
  image_size: 224
  train_batch_size: 32
  test_batch_size: 1
  num_workers: 1 # 36
  transform_config: null
  create_validation_set: false
  tiling:
    apply: false # true
    tile_size: null # 64
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16
```

If I don't have the ground truth, how should I change the config for the custom dataset? Also, any tips on best practices for getting optimal results?
At test time, the model normalizes both the anomaly map and the score (anomalib/anomalib/deploy/inferencers/torch.py, line 154 in 8c1a04f). The values used for this normalization are obtained during validation, but can also be provided by you in the metadata (anomalib/anomalib/deploy/optimize.py, lines 30 to 55 in 8c1a04f). So the best way would be to check your model's metadata and adjust it if needed.
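For intuition, the min-max normalization applied to the anomaly map looks roughly like this (a sketch of the idea, not the library's exact code; `threshold`, `min_val` and `max_val` stand for the values stored in the metadata during validation):

```python
import numpy as np


def normalize_anomaly_map(anomaly_map, threshold, min_val, max_val):
    """Min-max normalize so the adaptive threshold lands at 0.5."""
    # Center on the threshold, scale by the observed value range,
    # shift so the threshold maps to mid-gray, then clip to [0, 1].
    normalized = (anomaly_map - threshold) / (max_val - min_val) + 0.5
    return np.clip(normalized, 0.0, 1.0)


amap = np.array([[0.1, 0.5], [0.9, 1.3]])
out = normalize_anomaly_map(amap, threshold=0.5, min_val=0.1, max_val=1.3)
print(out[0, 1])  # → 0.5 (pixels exactly at the threshold map to mid-gray)
```

This is why different validation statistics (e.g. from zero masks) produce differently scaled heatmaps for the same raw model output.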
Hi @LukasBommes and @innat, it's because you set the adaptive threshold to

If you want to use a custom dataset, you should use the folder dataset. Here is a sample config showing how to use it.
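For context, the adaptive threshold is chosen as the score value that maximizes F1 on the validation set; a rough sketch of that idea (not anomalib's exact implementation) looks like this:

```python
import numpy as np


def adaptive_threshold(scores, labels):
    """Pick the score threshold that maximizes F1 on validation data.

    scores: anomaly scores; labels: 1 = anomalous, 0 = normal.
    """
    best_thr, best_f1 = 0.0, -1.0
    for thr in np.unique(scores):
        preds = scores >= thr
        tp = np.sum(preds & (labels == 1))
        fp = np.sum(preds & (labels == 0))
        fn = np.sum(~preds & (labels == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_f1, best_thr = f1, thr
    return best_thr


scores = np.array([0.1, 0.2, 0.3, 0.8, 0.9])
labels = np.array([0, 0, 0, 1, 1])
print(adaptive_threshold(scores, labels))  # → 0.8
```

Note that with all-zero masks the validation set contains no positive pixels, so every candidate threshold gets F1 = 0 and the search degenerates, which would plausibly explain the different heatmap scaling reported above.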
@innat, if you don't have the mask annotations, you could set the config as follows:

```yaml
dataset:
  name: mvtec # options: [mvtec, btech, folder]
  format: folder # mvtec
  path: C:/Users/..../bottle/test # ./datasets/MVTec
  normal: good
  abnormal: defected_type_1
  split_ratio: 0.2
  seed: 0
  task: classification # <- changed
  mask: null # <- changed
  extensions: null
  # category: bottle
  image_size: 224
  train_batch_size: 32
  test_batch_size: 1
  num_workers: 1 # 36
  transform_config: null
  create_validation_set: false
  tiling:
    apply: false # true
    tile_size: null # 64
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16
```

Since there are no mask annotations, the task would become `classification`.
@samet-akcay Thank you. My current dataset is kind of a mess (some samples have ground truth and some don't); I need to sort them out properly. Could you please tell me how I should update the config for the following folder structure? It's like MVTec: good samples for training, and good + defective samples for testing/evaluation, with ground-truth binary masks.

```
dummy_folder_name         # object type
  train                   # samples for training
    good                  # contains only good samples for training
  test                    # samples for testing
    good                  # contains only good samples for testing
    defected_type_1       # defective images
    defected_type_2       # defective images
    ...
  ground_truth            # contains the binary masks
    defected_type_1_mask  # binary masks
    defected_type_2_mask  # binary masks
    ...
```

I'm mostly confused by the following params:

```yaml
name: any_name_ # options: [mvtec, btech, folder]
format: folder # mvtec
path: ?
normal: ?
abnormal: ?
task: segmentation
mask: ?
```

Lastly, the
I would place them like this:

```
dummy_folder_name         # object type
  train
    good                  # contains only good samples for training
  test
    good                  # contains only good samples for testing
    defect
      defected_type_1     # defective images
      defected_type_2     # defective images
      ...
  ground_truth
    defect
      defected_type_1_mask  # binary masks
      defected_type_2_mask  # binary masks
      ...
```

and use:

```yaml
name: dummy_folder_name
format: folder
path: path/to/dummy_folder_name
normal: good
abnormal: defect
task: segmentation
mask: path/to/dummy_folder_name/ground_truth/defect
```

I'm not 100% sure though; I need to double-check this. Another alternative to the folder dataset parameters would be to remove
Okay, so in the config setup above:

```yaml
name: dummy_folder_name
format: folder
path: path/to/dummy_folder_name
normal: good # I think it should be 'train/good'
abnormal: defect # and it should be 'test/defect'
task: segmentation
mask: path/to/dummy_folder_name/ground_truth/defect
```

If so, where should I set the path for
Argh, just noticed it now.
Examples in the docstring might be helpful to understand how
Thanks for all your input on this, and good to know that the issue with the heatmap is expected behaviour. I am also looking forward to your new method for setting the threshold in an unsupervised manner. @samet-akcay: I tried a custom folder dataset and set

My dataset structure is as follows:

and this is the complete config
Thanks for fixing this. I'll give it a try in the afternoon.
I ran into an issue when training on a dataset (a subset of MVTec) where I set all ground-truth masks to zero, to simulate training on a dataset for which I have no ground-truth masks. When training with the actual ground-truth masks, the model produces heatmaps as expected, as in the first image below (produced with tools/inference.py). However, when training with the zero masks, the heatmaps seem to be scaled differently, as in the second image below. The confidence score seems unaffected. This behaviour is the same for both PADIM and PatchCore; I haven't tested the other models.

This is my model config for PADIM