💥 Create a script to upgrade v0.* configuration format to v1 (#1738)
* Create migration tool

* Remove excluded keys from data arg overwrites

* Fix CLI arg to ensure the enum vs str type

* Update the config script

* Add tests

* Create a migration guide in the docs

* update migration documentation

* Fix albumentation tests

* Fix albumentation tests
samet-akcay authored Feb 20, 2024
1 parent 789a185 commit 78e0ace
Showing 11 changed files with 785 additions and 7 deletions.
1 change: 1 addition & 0 deletions docs/source/index.md
@@ -89,6 +89,7 @@ Learn how to develop and contribute to anomalib.
:hidden:
markdown/get_started/anomalib
markdown/get_started/migration
```

```{toctree}
164 changes: 164 additions & 0 deletions docs/source/markdown/get_started/migration.md
@@ -0,0 +1,164 @@
# Migrating from 0.\* to 1.0

## Overview

The 1.0 release of Anomalib introduces several changes to the library,
including a new configuration format. This guide provides an overview of the
changes and how to migrate from 0.\* to 1.0.

## Installation

For installation instructions, refer to the [installation guide](anomalib.md).

## Changes to the CLI

### Upgrading the Configuration

There are several changes to the configuration of Anomalib. The configuration
file has been updated to include new parameters and remove deprecated parameters.
In addition, some parameters have been moved to different sections of the
configuration.

Anomalib provides a Python script to upgrade the configuration file from 0.\* to 1.0.
To upgrade the configuration file, run the following command:

```bash
python tools/upgrade/config.py \
--input_config <path_to_0.*_config> \
--output_config <path_to_1.0_config>
```

This script will ensure that the configuration file is updated to the 1.0 format.
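At its core, such an upgrade is a mapping from old key paths to new ones. The sketch below is not the actual `tools/upgrade/config.py` implementation; it is a minimal illustration of the kind of remapping the script performs, covering only the `data` keys shown in the diff later in this guide:

```python
# Hypothetical sketch of the key remapping an upgrade script performs.
# This is NOT the actual tools/upgrade/config.py; the key names follow the
# data-section diff shown in this guide, and the class_path is hardcoded
# to MVTec purely for illustration.

def upgrade_data_section(old: dict) -> dict:
    """Convert a v0.* ``dataset`` section into a v1.0 ``data`` section."""
    renamed = {"path": "root"}  # old key -> new key
    dropped = {"name", "format", "tiling", "transform_config"}  # removed in v1.0

    init_args = {}
    for key, value in old.items():
        if key in dropped:
            continue
        init_args[renamed.get(key, key)] = value

    # transform_config was split into two flat keys in v1.0.
    transform = old.get("transform_config") or {}
    init_args["transform_config_train"] = transform.get("train")
    init_args["transform_config_eval"] = transform.get("eval")

    return {"class_path": "anomalib.data.MVTec", "init_args": init_args}
```

The real script additionally handles the `model`, `metrics`, and trainer sections, as the diffs below show.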

In the following sections, we will discuss the changes to the configuration file
in more detail.

### Changes to the Configuration File

#### Data

The `data` section of the configuration file has been updated such that the args
can be directly used to instantiate the data object. Below are the differences
between the old and new configuration files highlighted in a markdown diff format.

```diff
-dataset:
+data:
- name: mvtec
- format: mvtec
+ class_path: anomalib.data.MVTec
+ init_args:
- path: ./datasets/MVTec
+ root: ./datasets/MVTec
category: bottle
image_size: 256
center_crop: null
normalization: imagenet
train_batch_size: 72
eval_batch_size: 32
num_workers: 8
task: segmentation
test_split_mode: from_dir # options: [from_dir, synthetic]
test_split_ratio: 0.2 # fraction of train images held out for testing (usage depends on test_split_mode)
val_split_mode: same_as_test # options: [same_as_test, from_test, synthetic]
val_split_ratio: 0.5 # fraction of train/test images held out for validation (usage depends on val_split_mode)
seed: null
- transform_config:
- train: null
- eval: null
+ transform_config_train: null
+ transform_config_eval: null
- tiling:
- apply: false
- tile_size: null
- stride: null
- remove_border_count: 0
- use_random_tiling: False
- random_tile_count: 16
```

Here is the summary of the changes to the configuration file:

- The `name` and `format` keys from the old configuration are absent in the new
  configuration; their role is now covered by the class referenced in `class_path`.
- Introduction of a `class_path` key in the new configuration specifies the Python
class path for data handling.
- The structure has been streamlined in the new configuration, moving everything
under `data` and `init_args` keys, simplifying the hierarchy.
- `transform_config` keys were split into `transform_config_train` and
`transform_config_eval` to clearly separate training and evaluation configurations.
- The `tiling` section present in the old configuration has been completely
removed in the new configuration. v1.0.0 does not support tiling. This feature
will be added back in a future release.
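The `class_path`/`init_args` pair is structured so that a loader can instantiate the configured object dynamically. A sketch of that mechanism, demonstrated with a standard-library class since it works for any importable class, not only `anomalib.data.MVTec`:

```python
import importlib


def instantiate(class_path: str, init_args: dict):
    """Import the class named by ``class_path`` and call it with ``init_args``."""
    module_name, _, class_name = class_path.rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**init_args)


# Demonstrated with a stdlib class; with anomalib installed, the same call
# pattern applies to {"class_path": "anomalib.data.MVTec", "init_args": {...}}.
dt = instantiate("datetime.date", {"year": 2024, "month": 2, "day": 20})
```

This is the general idea behind the new format; the actual loading in Anomalib is handled by its CLI machinery rather than a helper like the one above.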

#### Model

Similar to data configuration, the `model` section of the configuration file has
been updated such that the args can be directly used to instantiate the model object.
Below are the differences between the old and new configuration files highlighted
in a markdown diff format.

```diff
model:
- name: patchcore
- backbone: wide_resnet50_2
- pre_trained: true
- layers:
+ class_path: anomalib.models.Patchcore
+ init_args:
+ backbone: wide_resnet50_2
+ pre_trained: true
+ layers:
- layer2
- layer3
- coreset_sampling_ratio: 0.1
- num_neighbors: 9
+ coreset_sampling_ratio: 0.1
+ num_neighbors: 9
- normalization_method: min_max # options: [null, min_max, cdf]
+normalization:
+ normalization_method: min_max
```

Here is the summary of the changes to the configuration file:

- Model Identification: Transition from `name` to `class_path` for specifying
  the model, giving an explicit reference to the model's implementation.
- Initialization Structure: Introduction of `init_args` to encapsulate the
  model's initialization parameters, so the model can be instantiated directly
  from the configuration.
- Normalization Method: The `normalization_method` key is removed from the `model`
section and moved to a separate `normalization` section in the new configuration.
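Relatedly, the old `name: patchcore` style depended on converting between snake_case names and PascalCase class names. This commit also makes the helpers that do this public in `anomalib.models`; their core regex, sketched in isolation:

```python
import re


def pascal_to_snake(name: str) -> str:
    """Convert a PascalCase class name to a snake_case model name.

    Same regex as the convert_pascal_to_snake_case helper in
    anomalib.models: insert "_" before each internal capital, then lowercase.
    """
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()
```

For example, `EfficientAd` becomes `efficient_ad`, matching the names listed by `get_available_models()` in the diff further down this page.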

#### Metrics

The `metrics` section of the configuration file has been updated such that the
args can be directly used to instantiate the metrics object. Below are the differences
between the old and new configuration files highlighted in a markdown diff format.

```diff
metrics:
image:
- F1Score
- AUROC
pixel:
- F1Score
- AUROC
threshold:
- method: adaptive #options: [adaptive, manual]
- manual_image: null
- manual_pixel: null
+ class_path: anomalib.metrics.F1AdaptiveThreshold
+ init_args:
+ default_value: 0.5
```

Here is the summary of the changes to the configuration file:

- Metric Identification: Transition from `method` to `class_path` for specifying
  the thresholding metric, giving an explicit reference to the metric's implementation.
- Initialization Structure: Introduction of `init_args` to encapsulate the
  metric's initialization parameters, so the threshold metric can be instantiated
  directly from the configuration.
- Threshold Method: The `method` key (with its `adaptive`/`manual` options) is
  removed from the `threshold` section and replaced by a `class_path` that points
  directly to the chosen thresholding class.
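During migration, the old `method` value has to be mapped onto a threshold class. A hypothetical sketch of that mapping — only `F1AdaptiveThreshold` is confirmed by this guide; the class name used for `manual` below is an assumption for illustration:

```python
# Hypothetical sketch: map the old threshold "method" onto a v1.0 class_path.
# Only F1AdaptiveThreshold appears in this guide; "ManualThreshold" is an
# assumed name used purely for illustration.
THRESHOLD_CLASSES = {
    "adaptive": "anomalib.metrics.F1AdaptiveThreshold",
    "manual": "anomalib.metrics.ManualThreshold",  # assumed class name
}


def upgrade_threshold(old: dict) -> dict:
    """Convert a v0.* ``threshold`` section into the v1.0 class_path form."""
    method = old.get("method", "adaptive")
    return {
        "class_path": THRESHOLD_CLASSES[method],
        "init_args": {"default_value": 0.5},
    }
```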
2 changes: 1 addition & 1 deletion src/anomalib/cli/cli.py
@@ -137,7 +137,7 @@ def add_arguments_to_parser(self, parser: LightningArgumentParser) -> None:
parser.add_argument("--visualization.save", type=bool, default=False)
parser.add_argument("--visualization.log", type=bool, default=False)
parser.add_argument("--visualization.show", type=bool, default=False)
parser.add_argument("--task", type=TaskType, default=TaskType.SEGMENTATION)
parser.add_argument("--task", type=TaskType | str, default=TaskType.SEGMENTATION)
parser.add_argument("--metrics.image", type=list[str] | str | None, default=["F1Score", "AUROC"])
parser.add_argument("--metrics.pixel", type=list[str] | str | None, default=None, required=False)
parser.add_argument("--metrics.threshold", type=BaseThreshold | str, default="F1AdaptiveThreshold")
2 changes: 1 addition & 1 deletion src/anomalib/data/utils/transforms.py
@@ -120,7 +120,7 @@ def get_transforms(
# load transforms from config file
elif isinstance(config, str):
logger.info("Reading transforms from Albumentations config file: %s.", config)
transforms = A.load(filepath=config, data_format="yaml")
transforms = A.load(filepath_or_buffer=config, data_format="yaml")
elif isinstance(config, A.Compose):
logger.info("Transforms loaded from Albumentations Compose object")
transforms = config
9 changes: 5 additions & 4 deletions src/anomalib/models/__init__.py
@@ -20,6 +20,7 @@
Dfkde,
Dfm,
Draem,
Dsr,
EfficientAd,
Fastflow,
Ganomaly,
@@ -62,7 +63,7 @@ class UnknownModelError(ModuleNotFoundError):
logger = logging.getLogger(__name__)


def _convert_pascal_to_snake_case(pascal_case: str) -> str:
def convert_pascal_to_snake_case(pascal_case: str) -> str:
"""Convert PascalCase to snake_case.
Args:
@@ -81,7 +82,7 @@ def _convert_pascal_to_snake_case(pascal_case: str) -> str:
return re.sub(r"(?<!^)(?=[A-Z])", "_", pascal_case).lower()


def _convert_snake_to_pascal_case(snake_case: str) -> str:
def convert_snake_to_pascal_case(snake_case: str) -> str:
"""Convert snake_case to PascalCase.
Args:
@@ -110,7 +111,7 @@ def get_available_models() -> set[str]:
>>> get_available_models()
['ai_vad', 'cfa', 'cflow', 'csflow', 'dfkde', 'dfm', 'draem', 'efficient_ad', 'fastflow', ...]
"""
return {_convert_pascal_to_snake_case(cls.__name__) for cls in AnomalyModule.__subclasses__()}
return {convert_pascal_to_snake_case(cls.__name__) for cls in AnomalyModule.__subclasses__()}


def _get_model_class_by_name(name: str) -> type[AnomalyModule]:
@@ -128,7 +129,7 @@ def _get_model_class_by_name(name: str) -> type[AnomalyModule]:
logger.info("Loading the model.")
model_class: type[AnomalyModule] | None = None

name = _convert_snake_to_pascal_case(name).lower()
name = convert_snake_to_pascal_case(name).lower()
for model in AnomalyModule.__subclasses__():
if name == model.__name__.lower():
model_class = model
101 changes: 101 additions & 0 deletions tests/integration/tools/upgrade/expected_draem_v1.yaml
@@ -0,0 +1,101 @@
data:
class_path: anomalib.data.MVTec
init_args:
root: ./datasets/MVTec
category: bottle
image_size:
- 256
- 256
center_crop: null
normalization: imagenet
train_batch_size: 72
eval_batch_size: 32
num_workers: 8
task: segmentation
transform_config_train: null
transform_config_eval: null
test_split_mode: from_dir
test_split_ratio: 0.2
val_split_mode: same_as_test
val_split_ratio: 0.5
seed: null
model:
class_path: anomalib.models.Draem
init_args:
enable_sspcab: false
sspcab_lambda: 0.1
anomaly_source_path: null
beta:
- 0.1
- 1.0
normalization:
normalization_method: min_max
metrics:
image:
- F1Score
- AUROC
pixel:
- F1Score
- AUROC
threshold:
class_path: anomalib.metrics.F1AdaptiveThreshold
init_args:
default_value: 0.5
visualization:
visualizers: null
save: true
log: true
show: false
logging:
log_graph: false
seed_everything: true
task: segmentation
results_dir:
path: ./results
unique: false
ckpt_path: null
trainer:
accelerator: auto
strategy: auto
devices: 1
num_nodes: 1
precision: 32
logger: null
callbacks:
- class_path: lightning.pytorch.callbacks.EarlyStopping
init_args:
patience: 20
mode: max
monitor: pixel_AUROC
fast_dev_run: false
max_epochs: 1
min_epochs: null
max_steps: -1
min_steps: null
max_time: null
limit_train_batches: 1.0
limit_val_batches: 1.0
limit_test_batches: 1.0
limit_predict_batches: 1.0
overfit_batches: 0.0
val_check_interval: 1.0
check_val_every_n_epoch: 1
num_sanity_val_steps: 0
log_every_n_steps: 50
enable_checkpointing: true
enable_progress_bar: true
enable_model_summary: true
accumulate_grad_batches: 1
gradient_clip_val: 0
gradient_clip_algorithm: norm
deterministic: false
benchmark: false
inference_mode: true
use_distributed_sampler: true
profiler: null
detect_anomaly: false
barebones: false
plugins: null
sync_batchnorm: false
reload_dataloaders_every_n_epochs: 0
default_root_dir: null