[Algo] Attribute-based Representations for Accurate and Interpretable Video Anomaly Detection #1040

Merged: 62 commits, May 9, 2023
Changes from 59 commits
Commits (62)
a6628f4
add implementation of ai-vad up to feature extraction
djdameln Mar 24, 2023
c0d2243
collect embeddings during training
djdameln Mar 24, 2023
23e2307
compute scores in validation step
djdameln Mar 27, 2023
6ecec01
full pipeline implemented
djdameln Mar 27, 2023
acd3e9f
move model components to separate files
djdameln Mar 29, 2023
5a1906e
move density estimation to separate file
djdameln Mar 30, 2023
a88793a
formatting
djdameln Mar 30, 2023
16ff446
make model configurable from config
djdameln Mar 30, 2023
6a21c92
add gaussian smoothing of temporal predictions
djdameln Apr 5, 2023
ac33e5a
regions from original implementation (WIP)
djdameln Apr 20, 2023
1fcd90c
add original regions
djdameln Apr 21, 2023
2915627
separate keypoint extraction from region extraction
djdameln Apr 21, 2023
28e4be4
make target frame configurable
djdameln Apr 21, 2023
9a34c87
add test for gt frame handling
djdameln Apr 21, 2023
e33beba
update changelog
djdameln Apr 21, 2023
145b888
fix shape inference in visualizer
djdameln Apr 24, 2023
d07c07a
Merge branch 'video-gt-handling' into ai-vad-experimental
djdameln Apr 24, 2023
27f4788
remove video gt handling from ai-vad implementation
djdameln Apr 24, 2023
161048b
configure target frame
djdameln Apr 24, 2023
9d6ddda
docstrings
djdameln Apr 24, 2023
d3349ff
remove pre-supplied bboxes region extractor
djdameln Apr 24, 2023
6c0f61b
merge keypoint extractor and pose extractor
djdameln Apr 24, 2023
98335eb
clean up feature extractor
djdameln Apr 24, 2023
6a90ebf
only extract features of specified type
djdameln Apr 24, 2023
452ad92
raise error when no feature type selected
djdameln Apr 24, 2023
0462ed3
update image score computation
djdameln Apr 25, 2023
47d3191
typing and docstrings
djdameln Apr 25, 2023
528fe67
merge main
djdameln Apr 25, 2023
54baebe
import sorting
djdameln Apr 25, 2023
340813f
add third party clip implementation
djdameln Apr 26, 2023
d6520d7
update vocab file
djdameln Apr 26, 2023
8b24430
revert disable shuffle for training dataloader
djdameln Apr 26, 2023
1bf5ca2
Update src/anomalib/models/ai_vad/config.yaml
djdameln Apr 26, 2023
8d62764
remove reference to rdke
djdameln Apr 26, 2023
c766221
remove F1 from evaluation metrics
djdameln Apr 26, 2023
c1aeb83
Update src/anomalib/models/ai_vad/features.py
djdameln Apr 26, 2023
74e0e0d
Merge branch 'ai-vad' of https://github.com/djdameln/anomalib into ai…
djdameln Apr 26, 2023
e5c35b9
Update src/anomalib/models/ai_vad/density.py
djdameln Apr 26, 2023
f77d804
remove init
djdameln Apr 26, 2023
f2c20b1
change variable name
djdameln Apr 26, 2023
abfa244
ignore unused variable
djdameln Apr 26, 2023
e2a6df2
appearance -> deep
djdameln Apr 26, 2023
83fea19
Update src/anomalib/models/ai_vad/features.py
djdameln Apr 26, 2023
b1699c5
Merge branch 'ai-vad' of https://github.com/djdameln/anomalib into ai…
djdameln Apr 26, 2023
32d50e7
fix typo
djdameln Apr 26, 2023
7ec8921
Update src/anomalib/models/ai_vad/features.py
djdameln Apr 26, 2023
e3b978a
Update src/anomalib/models/ai_vad/features.py
djdameln Apr 26, 2023
8f821f4
improve readability
djdameln Apr 26, 2023
7db5ea9
Merge branch 'ai-vad' of https://github.com/djdameln/anomalib into ai…
djdameln Apr 26, 2023
0a635de
add todo
djdameln Apr 26, 2023
502e3a8
formatting
djdameln Apr 26, 2023
063f8d2
modify clip code to remove ftfy requirement
djdameln Apr 26, 2023
848e368
remove unused parts of clip implementation
djdameln Apr 26, 2023
4940ab5
fix warnings
djdameln Apr 26, 2023
30939f0
add clip to third party software list
djdameln Apr 26, 2023
74f2ec5
add missing variable to docstring
djdameln Apr 26, 2023
f75e5c4
2022 -> 2023
djdameln Apr 26, 2023
1900131
set target frame parameter in all video datasets
djdameln Apr 26, 2023
7f67a47
update parameter name and fix typo
djdameln Apr 26, 2023
453eb27
address codacy issues
djdameln May 3, 2023
0f59bd1
codacy
djdameln May 3, 2023
435c108
Update config.yaml
djdameln May 9, 2023
3 changes: 3 additions & 0 deletions src/anomalib/data/__init__.py
@@ -151,6 +151,7 @@ def get_datamodule(config: DictConfig | ListConfig) -> AnomalibDataModule:
task=config.dataset.task,
clip_length_in_frames=config.dataset.clip_length_in_frames,
frames_between_clips=config.dataset.frames_between_clips,
target_frame=config.dataset.target_frame,
image_size=(config.dataset.image_size[0], config.dataset.image_size[1]),
center_crop=center_crop,
normalization=config.dataset.normalization,
@@ -169,6 +170,7 @@ def get_datamodule(config: DictConfig | ListConfig) -> AnomalibDataModule:
task=config.dataset.task,
clip_length_in_frames=config.dataset.clip_length_in_frames,
frames_between_clips=config.dataset.frames_between_clips,
target_frame=config.dataset.target_frame,
image_size=(config.dataset.image_size[0], config.dataset.image_size[1]),
center_crop=center_crop,
normalization=config.dataset.normalization,
@@ -205,6 +207,7 @@ def get_datamodule(config: DictConfig | ListConfig) -> AnomalibDataModule:
task=config.dataset.task,
clip_length_in_frames=config.dataset.clip_length_in_frames,
frames_between_clips=config.dataset.frames_between_clips,
target_frame=config.dataset.target_frame,
image_size=(config.dataset.image_size[0], config.dataset.image_size[1]),
center_crop=center_crop,
normalization=config.dataset.normalization,
9 changes: 8 additions & 1 deletion src/anomalib/data/avenue.py
@@ -28,6 +28,7 @@
from pandas import DataFrame

from anomalib.data.base import AnomalibVideoDataModule, AnomalibVideoDataset
from anomalib.data.base.video import VideoTargetFrame
from anomalib.data.task_type import TaskType
from anomalib.data.utils import (
DownloadInfo,
@@ -140,6 +141,7 @@ class AvenueDataset(AnomalibVideoDataset):
split (Split): Split of the dataset, usually Split.TRAIN or Split.TEST
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
"""

def __init__(
@@ -151,8 +153,9 @@ def __init__(
split: Split,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
) -> None:
super().__init__(task, transform, clip_length_in_frames, frames_between_clips)
super().__init__(task, transform, clip_length_in_frames, frames_between_clips, target_frame)

self.root = root if isinstance(root, Path) else Path(root)
self.gt_dir = gt_dir if isinstance(gt_dir, Path) else Path(gt_dir)
@@ -172,6 +175,7 @@ class Avenue(AnomalibVideoDataModule):
gt_dir (Path | str): Path to the ground truth files
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
task (TaskType): Task type, 'classification', 'detection' or 'segmentation'
image_size (int | tuple[int, int] | None, optional): Size of the input image.
Defaults to None.
@@ -198,6 +202,7 @@ def __init__(
gt_dir: Path | str,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
task: TaskType = TaskType.SEGMENTATION,
image_size: int | tuple[int, int] | None = None,
center_crop: int | tuple[int, int] | None = None,
@@ -241,6 +246,7 @@ def __init__(
transform=transform_train,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
gt_dir=gt_dir,
split=Split.TRAIN,
@@ -251,6 +257,7 @@ def __init__(
transform=transform_eval,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
gt_dir=gt_dir,
split=Split.TEST,
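For context, a minimal usage sketch of the new target_frame argument on the Avenue datamodule; the dataset paths below are hypothetical, and the same argument is threaded through the ShanghaiTech and UCSDped datamodules further down.

from anomalib.data.avenue import Avenue
from anomalib.data.base.video import VideoTargetFrame

# Hypothetical paths; point these at a local copy of the CUHK Avenue dataset.
datamodule = Avenue(
    root="./datasets/avenue",
    gt_dir="./datasets/avenue/ground_truth_demo",
    clip_length_in_frames=2,
    frames_between_clips=1,
    target_frame=VideoTargetFrame.LAST,  # ground truth is retrieved for the last frame of each clip
)

With the default VideoTargetFrame.LAST, the returned ground truth corresponds to the final frame of each clip, as described in the docstrings above.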
9 changes: 8 additions & 1 deletion src/anomalib/data/shanghaitech.py
@@ -28,6 +28,7 @@
from torch import Tensor

from anomalib.data.base import AnomalibVideoDataModule, AnomalibVideoDataset
from anomalib.data.base.video import VideoTargetFrame
from anomalib.data.task_type import TaskType
from anomalib.data.utils import (
DownloadInfo,
@@ -187,6 +188,7 @@ class ShanghaiTechDataset(AnomalibVideoDataset):
split (Split): Split of the dataset, usually Split.TRAIN or Split.TEST
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
"""

def __init__(
@@ -198,8 +200,9 @@ def __init__(
split: Split,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
):
super().__init__(task, transform, clip_length_in_frames, frames_between_clips)
super().__init__(task, transform, clip_length_in_frames, frames_between_clips, target_frame)

self.root = root
self.scene = scene
@@ -219,6 +222,7 @@ class ShanghaiTech(AnomalibVideoDataModule):
scene (int): Index of the dataset scene (category) in range [1, 13]
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
task (TaskType): Task type, 'classification', 'detection' or 'segmentation'
image_size (int | tuple[int, int] | None, optional): Size of the input image.
Defaults to None.
@@ -245,6 +249,7 @@ def __init__(
scene: int,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
task: TaskType = TaskType.SEGMENTATION,
image_size: int | tuple[int, int] | None = None,
center_crop: int | tuple[int, int] | None = None,
@@ -288,6 +293,7 @@ def __init__(
transform=transform_train,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
scene=scene,
split=Split.TRAIN,
@@ -298,6 +304,7 @@ def __init__(
transform=transform_eval,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
scene=scene,
split=Split.TEST,
9 changes: 8 additions & 1 deletion src/anomalib/data/ucsd_ped.py
@@ -18,6 +18,7 @@
from torch import Tensor

from anomalib.data.base import AnomalibVideoDataModule, AnomalibVideoDataset
from anomalib.data.base.video import VideoTargetFrame
from anomalib.data.task_type import TaskType
from anomalib.data.utils import (
DownloadInfo,
@@ -155,6 +156,7 @@ class UCSDpedDataset(AnomalibVideoDataset):
split (str | Split | None): Split of the dataset, usually Split.TRAIN or Split.TEST
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
"""

def __init__(
@@ -166,8 +168,9 @@ def __init__(
split: Split,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
) -> None:
super().__init__(task, transform, clip_length_in_frames, frames_between_clips)
super().__init__(task, transform, clip_length_in_frames, frames_between_clips, target_frame)

self.root_category = Path(root) / category
self.split = split
@@ -186,6 +189,7 @@ class UCSDped(AnomalibVideoDataModule):
category (str): Sub-category of the dataset, e.g. 'bottle'
clip_length_in_frames (int, optional): Number of video frames in each clip.
frames_between_clips (int, optional): Number of frames between each consecutive video clip.
target_frame (VideoTargetFrame): Specifies the target frame in the video clip, used for ground truth retrieval
task (TaskType): Task type, 'classification', 'detection' or 'segmentation'
image_size (int | tuple[int, int] | None, optional): Size of the input image.
Defaults to None.
@@ -215,6 +219,7 @@ def __init__(
category: str,
clip_length_in_frames: int = 1,
frames_between_clips: int = 1,
target_frame: VideoTargetFrame = VideoTargetFrame.LAST,
task: TaskType = TaskType.SEGMENTATION,
image_size: int | tuple[int, int] | None = None,
center_crop: int | tuple[int, int] | None = None,
@@ -258,6 +263,7 @@ def __init__(
transform=transform_train,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
category=category,
split=Split.TRAIN,
@@ -268,6 +274,7 @@ def __init__(
transform=transform_eval,
clip_length_in_frames=clip_length_in_frames,
frames_between_clips=frames_between_clips,
target_frame=target_frame,
root=root,
category=category,
split=Split.TEST,
3 changes: 3 additions & 0 deletions src/anomalib/models/__init__.py
@@ -12,6 +12,7 @@
from omegaconf import DictConfig, ListConfig
from torch import load

from anomalib.models.ai_vad import AiVad
from anomalib.models.cfa import Cfa
from anomalib.models.cflow import Cflow
from anomalib.models.components import AnomalyModule
@@ -41,6 +42,7 @@
"ReverseDistillation",
"Rkde",
"Stfpm",
"AiVad",
]

logger = logging.getLogger(__name__)
@@ -92,6 +94,7 @@ def get_model(config: DictConfig | ListConfig) -> AnomalyModule:
"reverse_distillation",
"rkde",
"stfpm",
"ai_vad",
]
model: AnomalyModule

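As a rough illustration of what the new registration enables, the model can be resolved through the usual get_model factory; this sketch assumes the config.yaml shipped with the model is used as-is.

from omegaconf import OmegaConf

from anomalib.models import get_model

# Load the model's default configuration and let the factory resolve the "ai_vad" name.
config = OmegaConf.load("src/anomalib/models/ai_vad/config.yaml")
model = get_model(config)  # an AiVad-based AnomalyModule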
13 changes: 13 additions & 0 deletions src/anomalib/models/ai_vad/__init__.py
@@ -0,0 +1,13 @@
"""Implementatation of the AI-VAD Model.

AI-VAD: Accurate and Interpretable Video Anomaly Detection

Paper https://arxiv.org/pdf/2212.00789.pdf
"""

# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from .lightning_model import AiVad, AiVadLightning

__all__ = ["AiVad", "AiVadLightning"]
21 changes: 21 additions & 0 deletions src/anomalib/models/ai_vad/clip/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 OpenAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.