Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PIMO #1726

Open
wants to merge 54 commits into
base: main
Choose a base branch
from
Open

PIMO #1726

wants to merge 54 commits into from

Conversation

jpcbertoldo
Copy link
Contributor

@jpcbertoldo jpcbertoldo commented Feb 9, 2024

πŸ“ Description

Replace #1557, which replaces the PRs from https://gist.github.com/jpcbertoldo/12553b7eaa97cfbf3e55bfd7d1cafe88 .

Implements refactors from https://github.com/jpcbertoldo/anomalib/blob/metrics/refactors/src/anomalib/utils/metrics/perimg/.refactors .

arxiv: https://arxiv.org/abs/2401.01984
medium post: https://medium.com/p/c653ac30e802
GSoC deliverable: https://gist.github.com/jpcbertoldo/12553b7eaa97cfbf3e55bfd7d1cafe88

Closes #1728 1728

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • πŸ”¨ Refactor (non-breaking change which refactors the code base)
  • πŸš€ New feature (non-breaking change which adds functionality)
  • πŸ’₯ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • πŸ“š Documentation update
  • πŸ”’ Security update

βœ… Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • πŸ“‹ I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • πŸ“š I have made the necessary updates to the documentation (if applicable).
  • πŸ§ͺ I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

@jpcbertoldo
Copy link
Contributor Author

another unresolved issue from the previous PR

[about "ATTENTION..." in docstrings]

@ashwinvaidya17

Same here. We need to consider how these docstrings will be rendered in sphinx.

how can i check that?

@samet-akcay
Copy link
Contributor

samet-akcay commented Feb 9, 2024

another unresolved issue from the previous PR

[about "ATTENTION..." in docstrings]

@ashwinvaidya17

Same here. We need to consider how these docstrings will be rendered in sphinx.

how can i check that?

The documentation is built here based on your changes
image

@jpcbertoldo
Copy link
Contributor Author

another unresolved issue from the previous PR
[about "ATTENTION..." in docstrings]
@ashwinvaidya17

Same here. We need to consider how these docstrings will be rendered in sphinx.

how can i check that?

The documentation is built here based on your changes

there are some [metadata] stuff showing up, idk why

apparently sphinx doesnt like dataclasses?
https://anomalib--1726.org.readthedocs.build/en/1726/markdown/guides/reference/metrics/index.html#id5
this field is in the class AUPIMOResult but it's showing as if it was a function or w/e in the root (?)
while AUPIMOResult doesnt show at all

@samet-akcay
Copy link
Contributor

another unresolved issue from the previous PR
[about "ATTENTION..." in docstrings]
@ashwinvaidya17

Same here. We need to consider how these docstrings will be rendered in sphinx.

how can i check that?

The documentation is built here based on your changes

there are some [metadata] stuff showing up, idk why

apparently sphinx doesnt like dataclasses? https://anomalib--1726.org.readthedocs.build/en/1726/markdown/guides/reference/metrics/index.html#id5 this field is in the class AUPIMOResult but it's showing as if it was a function or w/e in the root (?) while AUPIMOResult doesnt show at all

Tree structure is also messed up a bit. It might be an idea to split each metric into a separate section.

@jpcbertoldo
Copy link
Contributor Author

another unresolved issue from the previous PR
[about "ATTENTION..." in docstrings]
@ashwinvaidya17

Same here. We need to consider how these docstrings will be rendered in sphinx.

how can i check that?

The documentation is built here based on your changes

https://anomalib--1726.org.readthedocs.build/en/1726/markdown/guides/reference/metrics/index.html

it's not quite working as expected

  1. i expected per_image to show as submenu in metrics, how could I do that?

  2. it seems not to like dataclasses; there are attributes of PIMOResult and AUPIMOResult showing as if it was a function (?) and the classes themselves dont' show

@samet-akcay samet-akcay added this to the v1.1.0 milestone Feb 29, 2024
@samet-akcay samet-akcay added Feature and removed Dependencies Pull requests that update a dependency file Tests labels Mar 25, 2024
@samet-akcay samet-akcay modified the milestones: v1.1.0, v1.2.0 May 14, 2024
Copy link
Collaborator

@ashwinvaidya17 ashwinvaidya17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot has changed since this PR was submitted, but I've finally gotten around to reviewing it. This is a huge PR with a lot of efforts behind it. However, I have some concerns. I've gone over it once but I think I'll require a few more passes for a more thorough review. But meanwhile we can start the discussions for the current opens.

src/anomalib/metrics/per_image/binclf_curve.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/_binclf_curve_numba.py Outdated Show resolved Hide resolved
requirements/base.txt Outdated Show resolved Hide resolved
tests/unit/metrics/per_image/test_utils.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/_validate.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/utils.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/binclf_curve_numpy.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/pimo_numpy.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/pimo_numpy.py Show resolved Hide resolved
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Copy link
Contributor Author

@jpcbertoldo jpcbertoldo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfinished topics:

  1. copyright in headers: i removed the copyright mention from my repo and just left the link to it; i kept the one from intel, but can someone confirm about this comment and this other comment, please?
  2. about the computation method here, but i'm quite sure that the current implementation was the fastest i figured out

src/anomalib/metrics/per_image/_validate.py Outdated Show resolved Hide resolved
tests/unit/metrics/per_image/test_utils.py Outdated Show resolved Hide resolved
src/anomalib/metrics/per_image/utils.py Outdated Show resolved Hide resolved
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
@jpcbertoldo jpcbertoldo mentioned this pull request Jun 7, 2024
10 tasks
Copy link
Collaborator

@ashwinvaidya17 ashwinvaidya17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think we are almost there. I have two major comments remaining. First, the author tag in the headers is not consistent with the rest of the repo. The current scans check third-party-programs.txt to verify the license and proper attribution for codes taken from other repositories. Second, numba adds complexity to the codebase. If it is not strictly necessary, let's not include it. But I am happy to hear other opinions.

tests/unit/metrics/per_image/__init__.py Show resolved Hide resolved
src/anomalib/metrics/per_image/__init__.py Outdated Show resolved Hide resolved
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import numba
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still on the fence regarding the numba requirement. While it is a good feature to have, it increases complexity. I would recommend dropping it from Anomalib. Any user who wants to use it can refer to the original repo. Any thoughts @samet-akcay @djdameln ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a bit hesitant to add the numba requirement. I can see the benefit that it brings, but at the same time it adds an unnecessary dependency and it increases the complexity of the code. Without Numba we could have a pure pytorch implementation of the metric, which would be much cleaner and more in line with the rest of the library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pytorch-only was slower than the one with numpy (which is slower than numba)
i see this is becoming a long issue, i can just drop numba, but can you confirm this is the decision to make? (just to avoid going back later)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpcbertoldo would it be possible to use the pytorch-only implementation by default, and use the Numba implementation if the user has Numba installed in their environment? This way we don't force casual users to install another dependency, and advanced users who care more about speed can install Numba to speed up the computation. We could log a warning to notify the users that the computation can be made faster by installing Numba.

Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
Signed-off-by: jpcbertoldo <24547377+jpcbertoldo@users.noreply.github.com>
@jpcbertoldo
Copy link
Contributor Author

Thanks! I think we are almost there. I have two major comments remaining. First, the author tag in the headers is not consistent with the rest of the repo. The current scans check third-party-programs.txt to verify the license and proper attribution for codes taken from other repositories. Second, numba adds complexity to the codebase. If it is not strictly necessary, let's not include it. But I am happy to hear other opinions.

I removed the author tag and add the original code's repo to third-party-programs.txt.

I think the decision for numba is rather up to y`all as maintainers, but it does give a nice speedup.

For the record (if i get it well), it only has two requirements:

install_requires = [
    'llvmlite >={},<{}'.format(min_llvmlite_version, max_llvmlite_version),
    'numpy >={},<{}'.format(min_numpy_run_version, max_numpy_run_version),
    'importlib_metadata; python_version < "3.9"',
]

https://github.com/numba/numba/blob/d4460feb8c91213e7b89f97b632d19e34a776cd3/setup.py#L369-L373

min_numpy_run_version = "1.22"
max_numpy_run_version = "1.27"
min_llvmlite_version = "0.41.0dev0"
max_llvmlite_version = "0.42"

https://github.com/numba/numba/blob/d4460feb8c91213e7b89f97b632d19e34a776cd3/setup.py#L25-L28

I copied these from numba==0.58.1, which is the minimum here.

Copy link
Contributor

@djdameln djdameln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting the following error when I try to use the AUPIMO metric to evaluate an anomalib model. Can you check it out?

ValueError: The `.compute()` return of the metric logged as 'pixel_AUPIMO' must be a tensor. Found (PIMOResult(shared_fpr_metric='mean-per-image-fpr'), AUPIMOResult(shared_fpr_metric='mean-per-image-fpr', fpr_lower_bound=1e-05, fpr_upper_bound=0.0001, num_threshs=48480))

Comment on lines +27 to +30
class PIMOSharedFPRMetric(Enum):
"""Shared FPR metric (x-axis of the PIMO curve)."""

MEAN_PERIMAGE_FPR: str = "mean-per-image-fpr"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need an enum for this if there is only one possible value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is kind of a heritage from old experiments where there was another possible value
but i kept it because this value shows in the .yaml file where the scores are saved
we can remove it from everywhere if that makes more sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be in favor of removing it. If we extend the metric with more possible values in the future we can always bring it back

src/anomalib/metrics/per_image/pimo.py Outdated Show resolved Hide resolved
Comment on lines +11 to +16
try:
import numba # noqa: F401
except ImportError:
HAS_NUMBA = False
else:
HAS_NUMBA = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this would be the best place to define this, as PIMO is the only module that uses Numba

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont personally dont see the difference, it's kind of arbitrary, no?
should i move it there?

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import numba
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also a bit hesitant to add the numba requirement. I can see the benefit that it brings, but at the same time it adds an unnecessary dependency and it increases the complexity of the code. Without Numba we could have a pure pytorch implementation of the metric, which would be much cleaner and more in line with the rest of the library.

@jpcbertoldo
Copy link
Contributor Author

I'm getting the following error when I try to use the AUPIMO metric to evaluate an anomalib model. Can you check it out?

ValueError: The `.compute()` return of the metric logged as 'pixel_AUPIMO' must be a tensor. Found (PIMOResult(shared_fpr_metric='mean-per-image-fpr'), AUPIMOResult(shared_fpr_metric='mean-per-image-fpr', fpr_lower_bound=1e-05, fpr_upper_bound=0.0001, num_threshs=48480))

This is normal because aupimo returns many values so it was encapsulated in that dataclass.
We could create an option in the torchmetrics interface to optionally return this and (by default) return just the average aupimo instead.
Sounds good?

@djdameln
Copy link
Contributor

djdameln commented Jul 9, 2024

This is normal because aupimo returns many values so it was encapsulated in that dataclass.
We could create an option in the torchmetrics interface to optionally return this and (by default) return just the average aupimo instead.
Sounds good?

I think that would be a good idea. With the default setting the metric should be fully compatible with Anomalib's pipeline. So users should be able to enable it from the config/cli or API to have Anomalib report the average AUPIMO value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add PIMO metric to anomalib
4 participants