
πŸš€ Add support for MVTec LOCO dataset and sPRO metric #1686

Merged

Conversation

@willyfh (Contributor) commented Jan 27, 2024

πŸ“ Description

This PR is based on the closed PR #1635, which adds an implementation of the MVTec LOCO AD dataset [Paper] along with the sPRO metric.
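
For context, here is a minimal sketch of the saturated per-region overlap (sPRO) at a single binarization threshold, following the definition in the MVTec LOCO paper. The function name and argument layout are illustrative assumptions, not the exact code added by this PR:

import torch

def spro_single_threshold(pred_mask: torch.Tensor,
                          gt_regions: list[torch.Tensor],
                          saturation_areas: list[float]) -> torch.Tensor:
    """pred_mask: (H, W) binary prediction; gt_regions: one (H, W) binary
    mask per annotated defect region; saturation_areas: per-region
    saturation thresholds in pixels, taken from defects_config.json."""
    scores = []
    for region, sat_area in zip(gt_regions, saturation_areas):
        overlap = (pred_mask.bool() & region.bool()).sum().float()
        # The per-region score saturates at 1 once `sat_area` pixels overlap.
        scores.append(torch.clamp(overlap / sat_area, max=1.0))
    return torch.stack(scores).mean()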

πŸ› οΈ Fixes #574 #471 #1341

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • πŸ”¨ Refactor (non-breaking change which refactors the code base)
  • πŸš€ New feature (non-breaking change which adds functionality)
  • πŸ’₯ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • πŸ“š Documentation update
  • πŸ”’ Security update

βœ… Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • πŸ“‹ I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • πŸ“š I have made the necessary updates to the documentation (if applicable).
  • πŸ§ͺ I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

@willyfh willyfh changed the title πŸš€ v1 - Add support for MVTec LOCO dataset and sPRO metric πŸš€ Add support for MVTec LOCO dataset and sPRO metric Jan 30, 2024
@samet-akcay (Contributor)

@willyfh, is this ready for review?

@willyfh (Contributor, Author) commented Jan 31, 2024

> @willyfh, is this ready for review?

@samet-akcay I have implemented the MVTec LOCO dataset and SPRO metric, and I have added unit tests for both. However, I haven't implemented AUSPRO yet. Should AUSPRO be addressed later in a separate PR? If so, I think this PR is ready for review. :)

@phcarval (Contributor) commented Feb 1, 2024

@willyfh, I would be very interested in getting the AUsPRO metric working. I would be happy to collaborate with you on this one if you're willing to :)

@samet-akcay (Contributor)

> @willyfh, is this ready for review?
>
> @samet-akcay I have implemented the MVTec LOCO dataset and SPRO metric, and I have added unit tests for both. However, I haven't implemented AUSPRO yet. Should AUSPRO be addressed later in a separate PR? If so, I think this PR is ready for review. :)

This PR is already big. I think it would also be easier for the reviewers if AUsPRO is a separate one.

@willyfh (Contributor, Author) commented Feb 1, 2024

> This PR is already big. I think it would also be easier for the reviewers if AUsPRO is a separate one.

@samet-akcay ah, I see. Okay, I will mark this PR as 'Ready for review' after updating the changelog soon. πŸ‘

@willyfh willyfh force-pushed the mvtec-loco-ad-dataset branch 4 times, most recently from 030b067 to 009eb60 on February 1, 2024 12:51
@willyfh willyfh marked this pull request as ready for review February 1, 2024 13:01
@willyfh (Contributor, Author) commented Feb 1, 2024

> @willyfh, I would be very interested in getting the AUsPRO metric working. I would be happy to collaborate with you on this one if you're willing to :)

@phcarval Ah, sorry I missed your comment. If you want, you can implement the AUsPRO metric and create the PR. I'm getting very busy at work these days, so I don't have much time. 😿

@willyfh (Contributor, Author) commented Feb 1, 2024

Now this PR is ready for review :) @samet-akcay

@samet-akcay (Contributor)

@jpcbertoldo, @phcarval, @blaz-r, do you guys have any concerns about this PR?

@jpcbertoldo (Contributor)

> @jpcbertoldo, @phcarval, @blaz-r, do you guys have any concerns about this PR?

I'll only have some time to check on Wednesday 😬

@blaz-r (Contributor) commented Feb 5, 2024

> @jpcbertoldo, @phcarval, @blaz-r, do you guys have any concerns about this PR?

I can check it tomorrow.

@phcarval (Contributor) commented Feb 5, 2024

Will check tomorrow or Wednesday.

@samet-akcay (Contributor) left a review comment

Thanks a lot for adding this support, @willyfh! I've got some questions and comments. I'll try to go for another round of review later.

Review threads (resolved): src/anomalib/data/image/mvtec_loco.py, src/anomalib/metrics/spro.py, tests/helpers/data.py, src/anomalib/metrics/collection.py
Comment on lines 100 to 101
if hasattr(trainer.datamodule, "saturation_config"):
    self._set_saturation_config(pl_module, trainer.datamodule.saturation_config)
@samet-akcay (Contributor)

Would it be possible to somehow handle this within SPRO itself? This is metric-specific; it would be great if we could sort it out in the metric.

@willyfh (Contributor, Author)

@samet-akcay do you mean loading the config directly in the metric? I think it would be possible as long as we introduce a new parameter in the YAML, so that we can load the config in the metric. I will do as follows if it is okay.

metrics:
  pixel:
    - AUROC
    - SPRO
  saturation_config: ./datasets/MVTec_LOCO/breakfast_box/defects_config.json

@willyfh (Contributor, Author)

Oh, but this approach means that changing the category in the data config (.yaml) is not enough, because we also need to change the model (metric) config, i.e., the saturation_config path.

@samet-akcay (Contributor) commented Feb 6, 2024

@ashwinvaidya17, can you confirm whether we could use class_path and init_args for the metrics above? The implementation would be quite straightforward if it is possible.

@willyfh (Contributor, Author)

@samet-akcay @ashwinvaidya17 In the latest implementation, I have handled the loading of the saturation config in the metric itself. You can see what I modified in this commit: d9a2333.

I found there is an existing function, metric_collection_from_dicts, which allows us to use class_path and init_args by passing a dict[str, dict[str, Any]] to it:

https://github.com/openvinotoolkit/anomalib/blob/main/src/anomalib/metrics/__init__.py#L184-L186

However, the current CLI doesn't support the dict type. So I modified this part by adding the dict type to make it work:

parser.add_argument(
    "--metrics.pixel",
    type=list[str] | str | dict[str, dict[str, Any]] | None,
    default=None,
    required=False,
)

I have tested this configuration in the YAML file, and the saturation_config path can be loaded on the metric side:

metrics:
  pixel:
    SPRO:
      class_path: anomalib.metrics.SPRO
      init_args:
        saturation_config: ./datasets/MVTec_LOCO/pushpins/defects_config.json

@willyfh (Contributor, Author) commented Feb 6, 2024

Thanks, @samet-akcay, for your comments and questions. I've provided responses to them. For some, before moving forward with any changes, I'd appreciate your confirmation on certain points. Also, I would greatly appreciate feedback from you guys @jpcbertoldo, @phcarval, and @blaz-r. Thanks a lot in advance for your help!

@blaz-r (Contributor) left a review comment

Thanks for this impressive work πŸ˜„. I left some comments and questions.

Review threads (resolved): src/anomalib/metrics/collection.py, src/anomalib/data/image/mvtec_loco.py, src/anomalib/metrics/spro.py
@willyfh (Contributor, Author) commented Feb 6, 2024

@blaz-r thanks for your comments and questions! There is an important point that may change based on the discussions: the merged tensor for the multiple masks, i.e., masks, seems unable to handle all cases (see: #1686 (comment) and #1686 (comment)). So we could simply pass a list of masks without merging. But for the binary mask, i.e., mask, we may still need to merge it for use in the other metrics. cc: @samet-akcay @jpcbertoldo @phcarval

@jpcbertoldo (Contributor) left a review comment

@willyfh I didn't thoroughly review because it looks like there will be an essential change in how the multiple masks of a single image are handled (which will eventually change interfaces as well).

I left comments on this topic, but overall it seems like there will be a list[list[tensor]] being passed around.

I'm imagining a solution like this (just a scratch):
https://gist.github.com/jpcbertoldo/b766da23f45a4117b428940764c50de3

This has a weird effect: a batch with n predictions can spit out >= n sPRO curves (1 per mask, but each image can have 1+ masks); but so does PRO.
However, it should be OK for sPRO because it is averaged over the batch dimension before the AUC computation.

One major point is that going through these masks with for loops will be slow.
The good news is that #1726 will have a method that is much faster for computing the TPR (i.e., overlap, recall) curves.

If you manage to do something like the approach in my scratch above, it will be compatible with the code coming from there and we'll be able to accelerate the computation here later.
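
To make the curve-vs-single-value point concrete, here is a minimal sketch of how an sPRO-vs-threshold curve could be assembled (AU-sPRO would then integrate such a curve against the false-positive rate up to a limit). The function and its arguments are illustrative, not code from this PR or the linked gist:

import torch

def spro_curve(anomaly_map: torch.Tensor,
               gt_regions: list[torch.Tensor],
               sat_areas: list[float],
               thresholds: torch.Tensor) -> torch.Tensor:
    """One sPRO value per threshold; as noted above, scores are averaged
    over the batch dimension before any AUC computation."""
    scores = []
    for t in thresholds:
        pred = anomaly_map >= t  # binarize the score map at this threshold
        per_region = [
            torch.clamp((pred & r.bool()).sum().float() / s, max=1.0)
            for r, s in zip(gt_regions, sat_areas)
        ]
        scores.append(torch.stack(per_region).mean())
    return torch.stack(scores)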


Disclaimer: raising another topic that I'm not necessarily in favor of addressing, for the sake of getting this PR through.

MVTecLoco assumes that each anomaly type (each of which has a different label) defines the saturation parameter for its images. This strategy is rather tight because one could define image-specific saturation.

Again, it's not necessary to cover this here, but having it in mind can't hurt IMO.


@willyfh I know this dataset is very painful, but thanks for the effort, you're doing great!

Review threads (resolved): src/anomalib/metrics/spro.py
Comment on lines +14 to +17
test_split_mode: FROM_DIR
test_split_ratio: 0.2
val_split_mode: FROM_DIR
val_split_ratio: 0.5
@jpcbertoldo (Contributor)

When FROM_DIR is used, isn't *_split_ratio ignored?
If yes, I guess it should be null?
LOCO specifically defines a fixed val set to avoid these splits.

@willyfh (Contributor, Author)

I believe using 'null' might lead to ambiguity with the 'none' option, no? As far as I know, 'none' is supposed to be used to skip the validation/testing stage. You can refer to this issue for context: #1629.

I thought 'FROM_DIR' would be suitable since, in the case of 'test_split_mode', as also used by other datasets like MVTec, the FROM_DIR option retrieves the test samples from a fixed test set. Even if FROM_DIR is used, the split ratio is actually ignored because the test set already provides both normal and abnormal samples.

But yeah, the *_split_ratio is quite confusing because it is actually ignored. So it might be better to remove the split ratio from the yaml file.

test_split_mode: FROM_DIR
test_split_ratio: 0.2
val_split_mode: SAME_AS_TEST
val_split_ratio: 0.5

@jpcbertoldo (Contributor)

test_split_mode: FROM_DIR and val_split_mode: FROM_DIR are fine, I was just referring to this:

> if FROM_DIR is used, the split ratio is actually ignored because the test set already provides both normal and abnormal samples.
>
> So it might be better to remove the split ratio from the yaml file.

@ashwinvaidya17 is that the convention?

@willyfh (Contributor, Author)

Ah I see, I misunderstood your comments. Yeah, I wonder why the test_split_ratio and val_split_ratio are provided in the config file of MVTec and some other datasets even though it seems that those are ignored.

@djdameln (Contributor)

The test_split_ratio parameter is used in the following cases:

  • When the test_split_mode is FROM_DIR and there are no normal images in the provided folder. In this case the test_split_ratio determines the number of normal images that are sampled from the training set and moved to the test set.
  • When the test split mode is SYNTHETIC. In this case the normal images are sampled from the training set and converted to anomalous by applying random noise augmentations.

The val_split_ratio is used in the following cases:

  • When the val_split_mode is FROM_TEST. In this case the split ratio determines the number of normal and anomalous samples that are randomly drawn from the test set and moved to the validation set.
  • When the val_split_mode is SYNTHETIC, in the same way as for the test set.

By default MVTec uses TestSplitMode.FROM_DIR and ValSplitMode.SAME_AS_TEST, but other configurations are possible as well. The split ratio parameters are included in the config to make users aware that these exist and can be changed. We could consider setting them to null by default and raising a warning when a value is provided but ignored.

@willyfh (Contributor, Author)

> We could consider setting them to null by default and raising a warning when a value is provided but ignored.

@djdameln Got it. Should we just set them as null in this PR? I mean, the part about raising a warning when a value is provided but ignored is, I think, out of the scope of this PR, since it is actually a problem for all existing datasets, right? Not only for this specific dataset.

willyfh and others added 10 commits February 9, 2024 12:46 (commit titles truncated in this view; signed off by Willy Fitra Hendria, several co-authored by Samet Akcay)
@willyfh (Contributor, Author) commented Feb 10, 2024

@jpcbertoldo thanks for your comments! I have provided responses to them. I have managed to pass a list of tensors from the dataset to the metrics. I have a few questions.

> I'm imagining a solution like this (just a scratch):
> https://gist.github.com/jpcbertoldo/b766da23f45a4117b428940764c50de3

Is this solution also needed for the current sPRO (spro.py) metric? Or for the AU-sPRO (auspro.py) metric which will be implemented later in a separate PR? Currently spro.py only provides the sPRO computation for a single threshold (no curve), which is similar to pro.py for the PRO metric.

> MVTecLoco assumes that each anomaly type (each of which has a different label) defines the saturation parameter for its images. This strategy is rather tight because one could define image-specific saturation.

Does this mean that each mask of each image is annotated with its own saturation parameter? I wonder if there is any existing dataset that defines such image-specific saturation? If so, maybe we can "upgrade" the implementation to accommodate such a dataset when it is implemented later πŸ˜…

@jpcbertoldo (Contributor) left a review comment

> I have managed to pass a list of tensors from the dataset to the metrics.

You're right, list[Tensor] should do it (instead of list[list[Tensor]] as I had proposed).


> Is this solution also needed for the current sPRO (spro.py) metric? Or for the AU-sPRO (auspro.py) metric which will be implemented later in a separate PR? Currently spro.py only provides the sPRO computation for a single threshold (no curve), which is similar to pro.py for the PRO metric.

OK, I didn't realize this was also the approach for PRO. I was surprised to see it defined as the score at a single threshold, because in the end it is always used as a curve (threshold vs. score), and what actually matters is its AUC (the AU-(s)PRO).

As this is already the design of PRO, I think it is OK to follow it for the sake of consistency.


> Does this mean that each mask of each image is annotated with its own saturation parameter? I wonder if there is any existing dataset that defines such image-specific saturation?

To be clear: LOCO is not annotated per mask; each mask's saturation is deduced from its label.

I was just raising the possibility of that happening, but I don't think this PR should cover that case.

Comment on lines +123 to +124
# Resize the prediction to have the same size as the target mask
predictions = torch.nn.functional.interpolate(predictions.unsqueeze(1), targets[0].shape[-2:])
@jpcbertoldo (Contributor)

This should not happen inside a metric IMO, because it will modify the input without the user knowing it.
A user should make sure that the shapes are the same.

@willyfh (Contributor, Author)

@jpcbertoldo Ah yes, I agree the resizing process should be outside of the metric. Regarding this resizing process, there is one thing I'm not really sure about: the standard or correct procedure for handling this.

(a) Should we resize the predictions to match the size of the original mask? (b) Or should we resize the original mask to match the size of the prediction?

For the existing mask, it seems the mask is the one being resized to match the prediction, i.e., it depends on the image_size parameter in the config, right? The resizing process happens in __getitem__:

transformed = self.transform(image=image, mask=mask)

However, I noticed the way MVTec LOCO is evaluated here (I'm not sure if this is the common procedure or not): the prediction is resized back to the original size during inference, i.e., following the size of the original mask. Then, the resized predictions are evaluated at the original mask size.

If (b), maybe we can simply remove this interpolation and let the transform in __getitem__ resize the masks to match the prediction size. If (a), maybe we need to resize the prediction back to the original size in this post_process, along with providing a warning to let the user know the prediction will be resized back to the original mask size during evaluation:
https://github.com/openvinotoolkit/anomalib/blob/main/src/anomalib/callbacks/post_processor.py#L97

@samet-akcay @ashwinvaidya17 @blaz-r
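
For illustration, a minimal sketch of option (a), upsampling the anomaly maps back to the original annotation resolution before computing metrics. The helper name is hypothetical, and the bilinear mode follows the recommendation discussed later in this thread:

import torch
import torch.nn.functional as F

def resize_predictions_to_mask(anomaly_maps: torch.Tensor,
                               mask_hw: tuple[int, int]) -> torch.Tensor:
    """Upsample (N, h, w) anomaly score maps to the original mask size
    (H, W) using bilinear interpolation."""
    return F.interpolate(
        anomaly_maps.unsqueeze(1),  # (N, 1, h, w)
        size=mask_hw,               # original (H, W) of the annotation
        mode="bilinear",
        align_corners=False,
    ).squeeze(1)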

@djdameln (Contributor)

This is a good point. Resizing the mask to the (reduced) image size might lead to information loss, which might in turn lead to inaccurate evaluation metrics, especially in the case of small regions in the GT masks. So ideally the predictions should be resized to the original image and mask size before computing the metrics. This will probably require some changes to various components which might be outside of the scope of this PR.

For now, I would propose to use the same convention for the SPRO metric as for the other metrics, which is to resize the masks and evaluate at the user-defined image size. We can make the switch to evaluation at the original size in a separate PR.

@willyfh (Contributor, Author)

> I would propose to use the same convention for the SPRO metric as for the other metrics, which is to resize the masks and evaluate at the user-defined image size. We can make the switch to evaluation at the original size in a separate PR.

@djdameln I agree with this. I will resize the masks and evaluate at the user-defined image size πŸ‘

@jpcbertoldo (Contributor)

As @djdameln said, it's probably a good idea to keep it simple for the sake of finishing this PR, but that's not ideal. In our latest paper (AUPIMO, http://arxiv.org/abs/2401.01984) we actually talk specifically about this : ) because authors often use different ways to resize the masks, and that creates artifacts.

@willyfh (Contributor, Author)

@jpcbertoldo I see, I found this recommendation in your paper:

> Our recommendation is to apply bilinear interpolation to upsample anomaly score maps and evaluate at the original resolution in each image.

So it seems we really need to address evaluation at the original size. Let's do it in a separate PR for all datasets and all metrics πŸ˜…

@ashwinvaidya17 (Collaborator)

Let's add a todo note + GitHub issue somewhere so that we don't forget it.

Review threads (resolved): src/anomalib/metrics/spro.py, src/anomalib/metrics/collection.py
@phcarval (Contributor)

@willyfh: Sorry I don't have much time to do an in-depth review of your PR, but I see a lot of great comments already. Very nice job anyway!

@@ -164,6 +170,9 @@ def _create_val_split(self) -> None:
            # converted from random training sample
            self.train_data, normal_val_data = random_split(self.train_data, self.val_split_ratio, seed=self.seed)
            self.val_data = SyntheticAnomalyDataset.from_dataset(normal_val_data)
        elif self.val_split_mode == ValSplitMode.FROM_DIR:
            # the val_data is prepared in subclass
            pass
@djdameln (Contributor)

This could lead to problems with thresholding and normalization, as the validation set of MVTec LOCO only contains normal images by default. For accurate computation of the adaptive threshold and normalization statistics we would have to add some anomalous images here. I'm not sure what the best approach would be here though, since we do want to stick to the predefined splits for reproducibility.

@willyfh (Contributor, Author) commented Feb 13, 2024

> we do want to stick to the predefined splits for reproducibility.

@djdameln Yeah, actually, this is why the FROM_DIR option was implemented in the first place: to use the predefined validation split as it is. However, if a user doesn't want to use this option for some reason, or perhaps for the reason you mentioned, can they freely change it to same_as_test or synthetic?

if "masks" in elem:
# collate masks and mask_path as list
out_dict["masks"] = [item.pop("masks") for item in batch]
out_dict["mask_path"] = [item.pop("mask_path") for item in batch]
@djdameln (Contributor)

Why is this needed? The default collate function should already collate the mask paths as a list.

@willyfh (Contributor, Author)

@djdameln In the MVTec LOCO dataset, a single image can be paired with multiple mask paths stored as a list, where the number of mask paths per image can vary. When using the default collate function, it will raise an error because the Torch stack function cannot stack items with different lengths. The code is needed to avoid such an error.

+---+---------------+-------+--------+------------------------+---------------------------+-------------+
|   | path          | split | label  | image_path             | mask_path                 | label_index |
+===+===============+=======+========+========================+===========================+=============+
| 0 | datasets/name | test  | defect | path/to/image/file.png | [path/to/masks/file.png]  | 1           |
+---+---------------+-------+--------+------------------------+---------------------------+-------------+
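
A minimal sketch of the kind of collate override described here, keeping the ragged per-image mask lists as plain Python lists instead of stacking them. The function name is illustrative, and it assumes every item in the batch carries the "masks" and "mask_path" keys:

from torch.utils.data import default_collate

def collate_with_mask_lists(batch: list[dict]) -> dict:
    out = {}
    if "masks" in batch[0]:
        # torch.stack would fail on ragged mask lists, so keep them as lists.
        out["masks"] = [item.pop("masks") for item in batch]
        out["mask_path"] = [item.pop("mask_path") for item in batch]
    # Everything else (image, label, merged binary mask, ...) collates normally.
    out.update(default_collate(batch))
    return out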

parser.add_argument("--metrics.pixel", type=list[str] | str | None, default=None, required=False)
parser.add_argument(
"--metrics.pixel",
type=list[str] | str | dict[str, dict[str, Any]] | None,
@djdameln (Contributor)

Can you explain this change?

@willyfh (Contributor, Author)

@djdameln This change is needed to address this comment: #1686 (comment).

I provided additional explanation here: #1686 (comment).

Basically, we want to use class_path and init_args so that we can pass saturation_config to the SPRO metric directly, via this config:

metrics:
  pixel:
    SPRO:
      class_path: anomalib.metrics.SPRO
      init_args:
        saturation_config: ./datasets/MVTec_LOCO/pushpins/defects_config.json

update = False
for name, metric in pixel_metric.items(copy_state=False):
    if isinstance(metric, SPRO):
        metric.update(torch.squeeze(output["anomaly_maps"]), output["masks"])
@djdameln (Contributor)

Could this be handled inside the metric so that we don't need metric-specific logic here?

@willyfh (Contributor, Author) commented Feb 13, 2024

@djdameln this approach was actually to address this comment: #1686 (comment). I provided a more detailed explanation in this comment: #1686 (comment)

My initial approach was as follows:

pixel_metric.update(
    torch.squeeze(output["anomaly_maps"]),
    torch.squeeze(output["mask"].int()),
    masks=torch.squeeze(output["masks"]) if "masks" in output else None,
)

My initial approach can handle all metrics (including SPRO and others) without metric-specific logic in callbacks/metrics.py. However, the problem is that we need an additional argument, _:

def update(self, predictions: torch.Tensor, _: torch.Tensor, masks: torch.Tensor) -> None:

So, I changed the solution to avoid using _ by incorporating metric-specific logic. Is there a better idea to avoid metric-specific logic while also avoiding the use of _ at the same time?
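
Purely as an illustrative sketch (not code from this PR), one possible middle ground is to dispatch on each metric's update() signature instead of its class, assuming pixel_metric behaves like a dict of metrics; this avoids both the throwaway _ argument and the isinstance check:

import inspect
import torch

def update_pixel_metrics(pixel_metric, output: dict) -> None:
    preds = torch.squeeze(output["anomaly_maps"])
    for metric in pixel_metric.values():
        # Metrics that declare a `masks` parameter get the per-region mask
        # lists; all others get the merged binary mask.
        if "masks" in inspect.signature(metric.update).parameters:
            metric.update(preds, output["masks"])
        else:
            metric.update(preds, torch.squeeze(output["mask"].int()))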

Review threads (resolved): src/anomalib/metrics/collection.py

@samet-akcay samet-akcay added this to the v1.1.0 milestone Feb 29, 2024
@ashwinvaidya17 (Collaborator) left a review comment

Thanks for all the efforts! Apologies for the late review. I am fine with the changes. Just some minor comments.

parser.add_argument("--metrics.pixel", type=list[str] | str | None, default=None, required=False)
parser.add_argument(
"--metrics.pixel",
type=list[str] | str | dict[str, dict[str, Any]] | None,
@ashwinvaidya17 (Collaborator)

Can you make the same typing change to metrics.image?

@@ -182,16 +181,37 @@ def _update_metrics(
        image_metric.update(output["pred_scores"], output["label"].int())
        if "mask" in output and "anomaly_maps" in output:
            pixel_metric.to(self.device)
            pixel_metric.update(torch.squeeze(output["anomaly_maps"]), torch.squeeze(output["mask"].int()))
        if "masks" in output:
@ashwinvaidya17 (Collaborator)

Not for this PR, but we need to be careful with these names, as we might forget why we have mask and masks as two separate keys. And it might not be clear to new users.


@samet-akcay samet-akcay changed the base branch from main to feature/mvtec-loco April 9, 2024 08:39
@djdameln djdameln merged commit c1b7a28 into openvinotoolkit:feature/mvtec-loco Apr 9, 2024
3 checks passed