PIMO #1726
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
"""Per-Image Metrics.""" | ||
|
||
# Original Code | ||
# https://github.com/jpcbertoldo/aupimo | ||
# | ||
# Modified | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
from .binclf_curve import per_image_binclf_curve, per_image_fpr, per_image_tpr | ||
from .binclf_curve_numpy import BinclfAlgorithm, BinclfThreshsChoice | ||
from .pimo import AUPIMO, PIMO, AUPIMOResult, PIMOResult, aupimo_scores, pimo_curves | ||
from .pimo_numpy import PIMOSharedFPRMetric | ||
from .utils import ( | ||
compare_models_pairwise_ttest_rel, | ||
compare_models_pairwise_wilcoxon, | ||
format_pairwise_tests_results, | ||
per_image_scores_stats, | ||
) | ||
from .utils_numpy import StatsOutliersPolicy, StatsRepeatedPolicy | ||
|
||
__all__ = [ | ||
# constants | ||
"BinclfAlgorithm", | ||
"BinclfThreshsChoice", | ||
"StatsOutliersPolicy", | ||
"StatsRepeatedPolicy", | ||
"PIMOSharedFPRMetric", | ||
# result classes | ||
"PIMOResult", | ||
"AUPIMOResult", | ||
# functional interfaces | ||
"per_image_binclf_curve", | ||
"per_image_fpr", | ||
"per_image_tpr", | ||
"pimo_curves", | ||
"aupimo_scores", | ||
# torchmetrics interfaces | ||
"PIMO", | ||
"AUPIMO", | ||
# utils | ||
"compare_models_pairwise_ttest_rel", | ||
"compare_models_pairwise_wilcoxon", | ||
"format_pairwise_tests_results", | ||
"per_image_scores_stats", | ||
] |
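The exports above include `per_image_fpr` and `per_image_tpr`. As a rough illustration only (not the library's implementation), the sketch below derives false and true positive rates from a single curve of confusion matrices, assuming the `(thresholds, true class, predicted class)` layout that the NUMBA file in this PR documents. The helper name `rates_from_binclf_curve` and the example values are hypothetical.

```python
import numpy as np


def rates_from_binclf_curve(binclf_curve: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: (K, 2, 2) confusion matrices -> (fpr, tpr), each of shape (K,).

    Assumed layout: binclf_curve[k, true_class, predicted_class].
    """
    tn, fp = binclf_curve[:, 0, 0], binclf_curve[:, 0, 1]
    fn, tp = binclf_curve[:, 1, 0], binclf_curve[:, 1, 1]
    num_neg = tn + fp  # constant across thresholds for one image
    num_pos = fn + tp
    # An image without negatives (or positives) yields NaN instead of raising.
    with np.errstate(divide="ignore", invalid="ignore"):
        fpr = fp / num_neg
        tpr = tp / num_pos
    return fpr, tpr


# Made-up curve with 4 negative and 2 positive pixels at 3 thresholds.
example_curve = np.array(
    [
        [[0, 4], [0, 2]],  # lowest threshold: everything predicted positive
        [[3, 1], [1, 1]],
        [[4, 0], [2, 0]],  # highest threshold: everything predicted negative
    ],
)
fpr, tpr = rates_from_binclf_curve(example_curve)
```

Both rates decrease as the threshold increases, which is why the curves are built from thresholds in ascending order.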
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
"""Binary classification matrix curve (NUMBA implementation of low level functions). | ||
|
||
|
||
Details: `.binclf_curve`. | ||
""" | ||
|
||
# Original Code | ||
# https://github.com/jpcbertoldo/aupimo | ||
# | ||
# Modified | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
import numba | ||
I am still on the fence regarding the numba requirement. While it is a good feature to have, it increases complexity. I would recommend dropping it from Anomalib. Any user who wants to use it can refer to the original repo. Any thoughts @samet-akcay @djdameln?
I'm also a bit hesitant to add the numba requirement. I can see the benefit that it brings, but at the same time it adds an unnecessary dependency and it increases the complexity of the code. Without Numba we could have a pure pytorch implementation of the metric, which would be much cleaner and more in line with the rest of the library.
The pytorch-only implementation was slower than the numpy one (which is in turn slower than the numba one).
@jpcbertoldo would it be possible to use the pytorch-only implementation by default, and use the Numba implementation if the user has Numba installed in their environment? This way we don't force casual users to install another dependency, and advanced users who care more about speed can install Numba to speed up the computation. We could log a warning to notify the users that the computation can be made faster by installing Numba.
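One way the optional-dependency suggestion in this thread could look is sketched below. This is only an illustration of the idea, not code from this PR: the function `choose_binclf_backend`, the backend names `"numba"` and `"python"`, and the `has_numba` parameter are all hypothetical.

```python
import importlib.util
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# True when the optional `numba` package can be found in this environment.
HAS_NUMBA = importlib.util.find_spec("numba") is not None


def choose_binclf_backend(requested: Optional[str] = None, has_numba: bool = HAS_NUMBA) -> str:
    """Hypothetical selector: pick the fast backend only when it is available."""
    if requested == "numba" and not has_numba:
        # The user asked explicitly for numba, so failing loudly is clearer than falling back.
        msg = "The 'numba' backend was requested but numba is not installed."
        raise ImportError(msg)
    if requested is not None:
        return requested
    if has_numba:
        return "numba"
    logger.warning("numba is not installed; using the slower fallback. Installing numba can speed this up.")
    return "python"
```

With this shape, casual users never need numba installed, and the warning tells them a faster path exists. The real `numba` import would then live inside the branch that uses it rather than at module top level.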
||
import numpy as np | ||
from numpy import ndarray | ||
|
||
|
||
@numba.jit(nopython=True) | ||
def binclf_one_curve_numba(scores: ndarray, gts: ndarray, threshs: ndarray) -> ndarray: | ||
|
||
"""One binary classification matrix at each threshold (NUMBA implementation). | ||
|
||
This does the same as `_binclf_one_curve_python` but with numba using just-in-time compilation. | ||
|
||
Note: VALIDATION IS NOT DONE HERE! Make sure to validate the arguments before calling this function. | ||
|
||
Args: | ||
scores (ndarray): Anomaly scores (D,). | ||
gts (ndarray): Binary (bool) ground truth of shape (D,). | ||
threshs (ndarray): Sequence of thresholds in ascending order (K,). | ||
|
||
Returns: | ||
ndarray: Binary classification matrix curve (K, 2, 2) | ||
|
||
Details: `anomalib.metrics.per_image.binclf_curve_numpy.binclf_multiple_curves`. | ||
""" | ||
num_th = len(threshs) | ||
|
||
# POSITIVES | ||
scores_pos = scores[gts] | ||
# the sorting is very important for the algorithm to work and the speedup | ||
scores_pos = np.sort(scores_pos) | ||
# start counting with lowest th, so everything is predicted as positive (this variable is updated in the loop) | ||
num_pos = current_count_tp = len(scores_pos) | ||
|
||
tps = np.empty((num_th,), dtype=np.int64) | ||
|
||
# NEGATIVES | ||
# same thing but for the negative samples | ||
scores_neg = scores[~gts] | ||
scores_neg = np.sort(scores_neg) | ||
num_neg = current_count_fp = len(scores_neg) | ||
|
||
fps = np.empty((num_th,), dtype=np.int64) | ||
|
||
# it will progressively drop the scores that are below the current th | ||
for thidx, th in enumerate(threshs): | ||
num_drop = 0 | ||
num_scores = len(scores_pos) | ||
while num_drop < num_scores and scores_pos[num_drop] < th: # ! scores_pos ! | ||
num_drop += 1 | ||
# --- | ||
scores_pos = scores_pos[num_drop:] | ||
current_count_tp -= num_drop | ||
tps[thidx] = current_count_tp | ||
|
||
# same with the negatives | ||
num_drop = 0 | ||
num_scores = len(scores_neg) | ||
while num_drop < num_scores and scores_neg[num_drop] < th: # ! scores_neg ! | ||
num_drop += 1 | ||
# --- | ||
scores_neg = scores_neg[num_drop:] | ||
current_count_fp -= num_drop | ||
fps[thidx] = current_count_fp | ||
|
||
fns = num_pos * np.ones((num_th,), dtype=np.int64) - tps | ||
tns = num_neg * np.ones((num_th,), dtype=np.int64) - fps | ||
|
||
# sequence of dimensions is (threshs, true class, predicted class) (see docstring) | ||
return np.stack( | ||
( | ||
np.stack((tns, fps), axis=-1), | ||
np.stack((fns, tps), axis=-1), | ||
), | ||
axis=-1, | ||
).transpose(0, 2, 1) | ||
|
||
|
||
@numba.jit(nopython=True, parallel=True) | ||
def binclf_multiple_curves_numba(scores_batch: ndarray, gts_batch: ndarray, threshs: ndarray) -> ndarray: | ||
"""Multiple binary classification matrix at each threshold (NUMBA implementation). | ||
|
||
This does the same as `_binclf_multiple_curves_python` but with numba, | ||
using parallelization and just-in-time compilation. | ||
|
||
Note: VALIDATION IS NOT DONE HERE. Make sure to validate the arguments before calling this function. | ||
|
||
Args: | ||
scores_batch (ndarray): Anomaly scores (N, D,). | ||
gts_batch (ndarray): Binary (bool) ground truth of shape (N, D,). | ||
threshs (ndarray): Sequence of thresholds in ascending order (K,). | ||
|
||
Returns: | ||
ndarray: Binary classification matrix curves (N, K, 2, 2) | ||
|
||
Details: `anomalib.metrics.per_image.binclf_curve_numpy.binclf_multiple_curves`. | ||
""" | ||
num_imgs = scores_batch.shape[0] | ||
num_th = len(threshs) | ||
ret = np.empty((num_imgs, num_th, 2, 2), dtype=np.int64) | ||
for imgidx in numba.prange(num_imgs): | ||
scoremap = scores_batch[imgidx] | ||
mask = gts_batch[imgidx] | ||
ret[imgidx] = binclf_one_curve_numba(scoremap, mask, threshs) | ||
return ret |
Not sure if this would be the best place to define this, as PIMO is the only module that uses Numba
I personally don't see the difference; it's kind of arbitrary, no?
Should I move it there?
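For reviewers who want to sanity-check the counting loop in `binclf_one_curve_numba` without installing numba, a plain NumPy cross-check along these lines may help. It is a sketch, not part of the PR: the name `binclf_one_curve_reference` is hypothetical, and it swaps the incremental "drop scores below the threshold" loop for `np.searchsorted`, which should give the same counts under the same convention (a score is predicted positive when it is greater than or equal to the threshold).

```python
import numpy as np


def binclf_one_curve_reference(scores: np.ndarray, gts: np.ndarray, threshs: np.ndarray) -> np.ndarray:
    """Hypothetical NumPy reference: confusion matrices of shape (K, 2, 2).

    Layout is [threshold, true class, predicted class], as in the NUMBA version.
    """
    scores_pos = np.sort(scores[gts])
    scores_neg = np.sort(scores[~gts])
    # side="left" returns, per threshold, how many sorted scores are strictly below it.
    tps = len(scores_pos) - np.searchsorted(scores_pos, threshs, side="left")
    fps = len(scores_neg) - np.searchsorted(scores_neg, threshs, side="left")
    fns = len(scores_pos) - tps
    tns = len(scores_neg) - fps
    return np.stack(
        [
            np.stack([tns, fps], axis=-1),  # true class 0 (normal)
            np.stack([fns, tps], axis=-1),  # true class 1 (anomalous)
        ],
        axis=1,
    )


# Made-up data: two normal pixels and two anomalous pixels.
scores = np.array([0.1, 0.4, 0.35, 0.8])
gts = np.array([False, False, True, True])
threshs = np.array([0.0, 0.4, 0.9])
curve = binclf_one_curve_reference(scores, gts, threshs)
```

Comparing `curve` with the output of the NUMBA function on the same inputs is a cheap regression test for either implementation.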