
📃Update documentation #280

Merged: 24 commits (Jun 8, 2022)
a3b9f70
Add initial draft
ashwinvaidya17 Apr 26, 2022
f97b90f
Merge branch 'development' into docs/av/logging_hpo_benchmarking
ashwinvaidya17 Apr 26, 2022
d053a13
Add wandb sweep image
ashwinvaidya17 Apr 26, 2022
5d5caf1
Merge branch 'development' into docs/av/logging_hpo_benchmarking
ashwinvaidya17 Apr 29, 2022
72888cc
Refactor logging graph
ashwinvaidya17 Apr 29, 2022
cea418e
Modify callbacks init for new design
ashwinvaidya17 May 2, 2022
16f8dc2
Update images to hazelnut dataset
May 4, 2022
f83927d
Add unwatch to graphlogger callback
May 6, 2022
af04cc1
Merge branch 'development' into docs/av/logging_hpo_benchmarking
May 9, 2022
234444e
Update text to match new design
May 9, 2022
06527ef
Ignore mypy error
May 9, 2022
6a600eb
Replace paroject params
May 9, 2022
0a75c1b
Fix project path
May 9, 2022
d9b492f
Fix visualizer test
May 12, 2022
4cd5a8b
Address PR comments + markdown->rst
May 13, 2022
612d08a
Merge branch 'development' into docs/av/logging_hpo_benchmarking
ashwinvaidya17 Jun 1, 2022
221dcb9
Merge branch 'development' into docs/av/logging_hpo_benchmarking
ashwinvaidya17 Jun 1, 2022
8b3173a
Fix torchmetrics version
ashwinvaidya17 Jun 1, 2022
4a7683a
Fix tests
ashwinvaidya17 Jun 1, 2022
1316805
Merge branch 'development' into docs/av/logging_hpo_benchmarking
Jun 1, 2022
99def2b
Add metrics configuration callback to benchmarking
ashwinvaidya17 Jun 3, 2022
05be002
Merge branch 'development' of https://github.com/openvinotoolkit/anom…
ashwinvaidya17 Jun 3, 2022
250880c
Merge branch 'development' into docs/av/logging_hpo_benchmarking
ashwinvaidya17 Jun 3, 2022
9071600
Change wandb_sweep to sweep
ashwinvaidya17 Jun 3, 2022
20 changes: 20 additions & 0 deletions README.md
@@ -179,6 +179,26 @@ python tools/inference.py \

> Ensure that you provide the path to `meta_data.json` if you want normalization to be applied correctly.

## Hyperparameter Optimization

To run hyperparameter optimization, use the following command:

```bash
python tools/hpo/wandb_sweep.py --model padim --model_config ./path_to_config.yaml --sweep_config tools/hpo/sweep.yaml
```

For more details, refer to the [HPO Documentation](https://openvinotoolkit.github.io/anomalib/guides/hyperparameter_optimization.html).

## Benchmarking

To gather benchmarking data such as throughput across categories, use the following command:

```bash
python tools/benchmarking/benchmark.py --config <relative/absolute path>/<paramfile>.yaml
```

Refer to the [Benchmarking Documentation](https://openvinotoolkit.github.io/anomalib/guides/benchmarking.html) for more details.

___

## Datasets
7 changes: 5 additions & 2 deletions anomalib/models/cflow/config.yaml
@@ -49,8 +49,11 @@ metrics:
project:
seed: 0
path: ./results
log_images_to: [local]
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

# PL Trainer Args. Don't add extra parameter here.
trainer:
7 changes: 5 additions & 2 deletions anomalib/models/dfkde/config.yaml
@@ -33,8 +33,11 @@ metrics:
project:
seed: 42
path: ./results
log_images_to: []
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

# PL Trainer Args. Don't add extra parameter here.
trainer:
7 changes: 5 additions & 2 deletions anomalib/models/dfm/config.yaml
@@ -37,8 +37,11 @@ metrics:
project:
seed: 42
path: ./results
log_images_to: []
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

# PL Trainer Args. Don't add extra parameter here.
trainer:
7 changes: 5 additions & 2 deletions anomalib/models/ganomaly/config.yaml
@@ -52,8 +52,11 @@ metrics:
project:
seed: 0
path: ./results
log_images_to: []
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

optimization:
openvino:
7 changes: 5 additions & 2 deletions anomalib/models/padim/config.yaml
@@ -44,8 +44,11 @@ metrics:
project:
seed: 42
path: ./results
log_images_to: ["local"]
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

optimization:
openvino:
7 changes: 5 additions & 2 deletions anomalib/models/patchcore/config.yaml
@@ -46,8 +46,11 @@ metrics:
project:
seed: 0
path: ./results
log_images_to: [local]
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

# PL Trainer Args. Don't add extra parameter here.
trainer:
7 changes: 5 additions & 2 deletions anomalib/models/stfpm/config.yaml
@@ -52,8 +52,11 @@ metrics:
project:
seed: 0
path: ./results
log_images_to: [local]
logger: false # options: [tensorboard, wandb, csv] or combinations.

logging:
log_images_to: ["local"] # options: [wandb, tensorboard, local]. Make sure you also set the logger when using wandb or tensorboard.
logger: [] # options: [tensorboard, wandb, csv] or combinations.
log_graph: false # Logs the model graph to respective logger.

optimization:
openvino:
17 changes: 16 additions & 1 deletion anomalib/utils/callbacks/__init__.py
@@ -15,6 +15,7 @@
# and limitations under the License.

import os
import warnings
from importlib import import_module
from typing import List, Union

@@ -23,6 +24,7 @@
from pytorch_lightning.callbacks import Callback, ModelCheckpoint

from .cdf_normalization import CdfNormalizationCallback
from .graph import GraphLogger
from .min_max_normalization import MinMaxNormalizationCallback
from .model_loader import LoadModelCallback
from .timer import TimerCallback
@@ -76,7 +78,16 @@ def get_callbacks(config: Union[ListConfig, DictConfig]) -> List[Callback]:
else:
raise ValueError(f"Normalization method not recognized: {config.model.normalization_method}")

if not config.project.log_images_to == []:
# TODO Modify when logger is deprecated from project
if "log_images_to" in config.project.keys():
warnings.warn(
"'log_images_to' key will be deprecated from 'project' section of the config file."
" Please use the logging section in the config file.",
DeprecationWarning,
)
config.logging.log_images_to = config.project.log_images_to

if not config.logging.log_images_to == []:
callbacks.append(
VisualizerCallback(
task=config.dataset.task, inputs_are_normalized=not config.model.normalization_method == "none"
@@ -109,4 +120,8 @@ def get_callbacks(config: Union[ListConfig, DictConfig]) -> List[Callback]:
)
)

# Add callback to log graph to loggers
if config.logging.log_graph not in [None, False]:
callbacks.append(GraphLogger())

return callbacks
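
The deprecation shim above copies the legacy `project.log_images_to` value into the new `logging` section before the callbacks are built. A simplified, dependency-free sketch of the same pattern (plain dicts stand in for the OmegaConf objects anomalib uses, and `migrate_legacy_keys` is a hypothetical name, not an anomalib function):

```python
import warnings


def migrate_legacy_keys(config: dict) -> dict:
    """Move the legacy 'log_images_to' key from 'project' into 'logging'."""
    if "log_images_to" in config.get("project", {}):
        warnings.warn(
            "'log_images_to' will be deprecated from the 'project' section of the"
            " config file. Please use the 'logging' section instead.",
            DeprecationWarning,
        )
        # Preserve the old value so existing config files keep working.
        config.setdefault("logging", {})["log_images_to"] = config["project"]["log_images_to"]
    return config


legacy = {"project": {"log_images_to": ["local"]}}
migrated = migrate_legacy_keys(legacy)
```

After the call, `migrated["logging"]["log_images_to"]` holds `["local"]`, so downstream code can read the new location regardless of which section the user populated.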
52 changes: 52 additions & 0 deletions anomalib/utils/callbacks/graph.py
@@ -0,0 +1,52 @@
"""Log model graph to respective logger."""

# Copyright (C) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions
# and limitations under the License.

import torch
from pytorch_lightning import Callback, LightningModule, Trainer

from anomalib.utils.loggers import AnomalibTensorBoardLogger, AnomalibWandbLogger


class GraphLogger(Callback):
"""Log model graph to respective logger."""

def on_train_start(self, trainer: Trainer, pl_module: LightningModule) -> None:
"""Log model graph to respective logger.

Args:
trainer: Trainer object which contains a reference to the loggers.
pl_module: LightningModule object which is logged.
"""

for logger in trainer.loggers:
if isinstance(logger, AnomalibWandbLogger):
# NOTE: log graph gets populated only after one backward pass. This won't work for models which do not
# require training such as Padim
logger.watch(pl_module, log_graph=True, log="all")
break

def on_train_end(self, trainer: Trainer, pl_module: LightningModule) -> None:
"""Unwatch model if configured for wandb.

Args:
trainer: Trainer object which contains a reference to the loggers.
pl_module: LightningModule object which is logged.
"""

for logger in trainer.loggers:
if isinstance(logger, AnomalibTensorBoardLogger):
logger.log_graph(pl_module, input_array=torch.ones((1, 3, 256, 256)))
break
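
The callback dispatches on the logger type: only the first wandb-style logger watches the model at train start, and only the first TensorBoard-style logger receives the graph at train end. A dependency-free sketch of that dispatch pattern (the `WandbLike`/`TensorBoardLike` stubs are illustrative placeholders, not anomalib classes):

```python
class WandbLike:
    """Stub standing in for AnomalibWandbLogger."""

    def __init__(self):
        self.watched = False

    def watch(self, module, log_graph=True, log="all"):
        self.watched = True


class TensorBoardLike:
    """Stub standing in for AnomalibTensorBoardLogger."""

    def __init__(self):
        self.graph_logged = False

    def log_graph(self, module, input_array=None):
        self.graph_logged = True


def watch_first_wandb(loggers, module):
    # Mirrors on_train_start above: only the first matching logger watches the model.
    for logger in loggers:
        if isinstance(logger, WandbLike):
            logger.watch(module, log_graph=True, log="all")
            break


loggers = [TensorBoardLike(), WandbLike()]
watch_first_wandb(loggers, module=None)
```

The `break` after the first match is the key design choice: with several loggers of the same type configured, the model is only watched once.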
32 changes: 26 additions & 6 deletions anomalib/utils/loggers/__init__.py
@@ -16,6 +16,7 @@

import logging
import os
import warnings
from typing import Iterable, List, Union

from omegaconf.dictconfig import DictConfig
@@ -74,36 +75,55 @@ def get_experiment_logger(
Returns:
Union[LightningLoggerBase, Iterable[LightningLoggerBase], bool]: Logger
"""
if config.project.logger in [None, False]:

# TODO remove when logger is deprecated from project
if "logger" in config.project.keys():
warnings.warn(
"'logger' key will be deprecated from 'project' section of the config file."
" Please use the logging section in the config file.",
DeprecationWarning,
)
if "logging" not in config:
config.logging = {"logger": config.project.logger, "log_graph": False}
else:
config.logging.logger = config.project.logger

if config.logging.logger in [None, False]:
return False

logger_list: List[LightningLoggerBase] = []
if isinstance(config.project.logger, str):
config.project.logger = [config.project.logger]
if isinstance(config.logging.logger, str):
config.logging.logger = [config.logging.logger]

for logger in config.project.logger:
for logger in config.logging.logger:
if logger == "tensorboard":
logger_list.append(
AnomalibTensorBoardLogger(
name="Tensorboard Logs",
save_dir=os.path.join(config.project.path, "logs"),
log_graph=config.logging.log_graph,
)
)
elif logger == "wandb":
wandb_logdir = os.path.join(config.project.path, "logs")
os.makedirs(wandb_logdir, exist_ok=True)
name = (
config.model.name
if "category" not in config.dataset.keys()
else f"{config.dataset.category} {config.model.name}"
)
logger_list.append(
AnomalibWandbLogger(
project=config.dataset.name,
name=f"{config.dataset.category} {config.model.name}",
name=name,
save_dir=wandb_logdir,
)
)
elif logger == "csv":
logger_list.append(CSVLogger(save_dir=os.path.join(config.project.path, "logs")))
else:
raise UnknownLogger(
f"Unknown logger type: {config.project.logger}. "
f"Unknown logger type: {config.logging.logger}. "
f"Available loggers are: {AVAILABLE_LOGGERS}.\n"
f"To enable the logger, set `logging.logger` to one of the available loggers in config.yaml\n"
f"To disable the logger, set `logging.logger` to `false`."
33 changes: 33 additions & 0 deletions docs/source/guides/benchmarking.md
@@ -0,0 +1,33 @@
# Benchmarking

In addition to experiment tracking and hyperparameter optimization, anomalib includes a benchmarking script for gathering results across different combinations of models, their parameters, and dataset categories. Model performance and throughput are logged into a CSV file, which can also serve as a means to track model drift. Optionally, the same results can be logged to Weights & Biases and TensorBoard. A sample configuration file is shown below.

```yaml
seed: 42
compute_openvino: false
hardware:
- cpu
- gpu
writer:
- wandb
- tensorboard
grid_search:
dataset:
category:
- bottle
- cable
- capsule
- carpet
image_size: [224]
model_name:
- padim
- patchcore
```

This configuration computes throughput and performance metrics on CPU and GPU for four categories of the MVTec dataset using the Padim and PatchCore models. The dataset can be configured in the respective model configuration files. By default, `compute_openvino` is set to `false` to support instances where the OpenVINO requirements are not installed in the environment. Once installed, this flag can be set to `true` to measure the throughput of OpenVINO-optimized models. The `writer` parameter is optional and can be set to `writer: []` if the user only requires a CSV file without logging to the respective loggers. It is good practice to set the seed to ensure reproducibility across runs; it is therefore set to a non-zero value by default.
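
The `grid_search` section is a Cartesian product over all listed values. As an illustration of how such a section expands into individual runs (a hypothetical sketch, not the actual `benchmark.py` implementation):

```python
from itertools import product

# The grid from the sample configuration above, flattened to dotted keys.
grid_search = {
    "dataset.category": ["bottle", "cable", "capsule", "carpet"],
    "dataset.image_size": [224],
    "model_name": ["padim", "patchcore"],
}

keys = list(grid_search)
# Each run is one combination of category, image size, and model.
runs = [dict(zip(keys, combo)) for combo in product(*grid_search.values())]
print(len(runs))  # 4 categories x 1 image size x 2 models = 8 runs
```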

Once a configuration is decided, benchmarking can be run with:
```bash
python tools/benchmarking/benchmark.py --config <relative/absolute path>/<paramfile>.yaml
```
A nice feature of the provided benchmarking script is that if the host system has multiple GPUs, the runs are parallelized across all available GPUs for faster collection of results.
48 changes: 48 additions & 0 deletions docs/source/guides/hyperparameter_optimization.md
@@ -0,0 +1,48 @@
# Hyperparameter Optimization

The default configuration for a model will not always work on a new dataset. Additionally, parameters such as the learning rate, optimizer, and activation functions need to be tuned to increase performance. To make it easier to run such broad experiments and isolate the right combination of hyperparameters, Anomalib supports hyperparameter optimization using Weights & Biases (wandb).

## YAML file
A sample configuration file for hyperparameter optimization is provided at `tools/hpo/sweep.yaml` and is reproduced below:

```yaml
observation_budget: 10
method: bayes
metric:
name: pixel_AUROC
goal: maximize
parameters:
dataset:
category: capsule
image_size:
values: [128, 256]
model:
backbone:
values: [resnet18, wide_resnet50_2]
```

The `observation_budget` defines the total number of experiments to run, `method` is the optimization method to be used, and `metric` defines how model performance is evaluated. The entries under `parameters` are the hyperparameters to be optimized. For details on methods other than `bayes` and on parameter values other than lists, refer to the wandb documentation. Everything under the `parameters` key overrides the default values defined in the model configuration. Currently, only the dataset and model parameters are overridden during the HPO search.
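
Conceptually, each sweep trial picks one value per parameter and overlays it on the model's default configuration. A hypothetical sketch of that merge using plain dicts (the real merge in anomalib goes through OmegaConf, and `apply_sweep_params` is not an anomalib function):

```python
def apply_sweep_params(config: dict, picked: dict) -> dict:
    """Overlay sweep-selected values onto the default config, section by section."""
    merged = {section: dict(values) for section, values in config.items()}
    for section, overrides in picked.items():
        merged.setdefault(section, {}).update(overrides)
    return merged


default_config = {
    "dataset": {"category": "bottle", "image_size": 256},
    "model": {"backbone": "resnet18", "lr": 0.001},
}
# One combination the bayes search could pick from the sweep.yaml above.
picked = {
    "dataset": {"category": "capsule", "image_size": 128},
    "model": {"backbone": "wide_resnet50_2"},
}

merged = apply_sweep_params(default_config, picked)
```

Only the keys named under `parameters` change; unrelated defaults, such as the learning rate here, are carried over untouched.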

## Running HPO

:::{Note}
You will need to have logged into a wandb account to use HPO search and view the results.
:::

To run the hyperparameter optimization, use the following command:

```bash
python tools/hpo/wandb_sweep.py --model padim --model_config ./path_to_config.yaml --sweep_config tools/hpo/sweep.yaml
```

If `model_config` is not provided, the script falls back to the default config location for that model.

```bash
python tools/hpo/wandb_sweep.py --sweep_config tools/hpo/sweep.yaml
```


## Sample Output

![Sample configuration of a wandb sweep](../images/logging/wandb_sweep.png)
<figcaption>Sample wandb sweep on Padim</figcaption>