[Feature] Feature map visualization #293
Merged
20 commits
05cc4b4 WIP: vis (HIT-cwh)
7b8307e WIP: add visualization (HIT-cwh)
07695a6 WIP: add visualization hook
acf656f WIP: support razor visualizer (HIT-cwh)
4974fc7 WIP (HIT-cwh)
9b2dfd4 WIP: wrap draw_featmap (HIT-cwh)
b7d226d support feature map visualization (HIT-cwh)
fa2c31c add a demo image for visualization (HIT-cwh)
59d8119 fix typos (HIT-cwh)
914e87d Merge branch 'dev-1.x' into vis (HIT-cwh)
50fe8af change eps to 1e-6 (HIT-cwh)
bd9cdcc add pytest for visualization (HIT-cwh)
8a57a91 fix vis hook (HIT-cwh)
b219dbb fix arguments' name (HIT-cwh)
418ba19 Merge branch 'dev-1.x' into vis (HIT-cwh)
b9f5904 fix img path (HIT-cwh)
6b9d482 support draw inference results (HIT-cwh)
4c71c6a add visualization doc (HIT-cwh)
bb913ed fix figure url (HIT-cwh)
88f4d28 move files (HIT-cwh)
21 changes: 21 additions & 0 deletions
configs/distill/mmdet/cwd/cwd_fpn_retina_r101_retina_r50_1x_coco_visualization.py
@@ -0,0 +1,21 @@
_base_ = ['./cwd_fpn_retina_r101_retina_r50_1x_coco.py']

default_hooks = dict(
    checkpoint=dict(type='CheckpointHook', interval=-1),
    visualization=dict(
        _scope_='mmrazor',
        type='RazorVisualizationHook',
        enabled=True,
        recorders=dict(
            # todo: Maybe it is hard for users to understand why to add a
            # prefix `architecture.`
            neck=dict(
                _scope_='mmrazor',
                type='ModuleOutputs',
                source='architecture.neck')),
        mappings=dict(
            p3=dict(recorder='neck', data_idx=0),
            p4=dict(recorder='neck', data_idx=1),
            p5=dict(recorder='neck', data_idx=2),
            p6=dict(recorder='neck', data_idx=3)),
        out_dir='retina_vis'))
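To make the `mappings` entries in this config easier to read: each entry names a recorder and, through `data_idx`, one element of whatever that recorder captured. The sketch below only illustrates that lookup. It assumes the recorded neck output is a tuple with one feature map per pyramid level; the `resolve_mappings` helper and the placeholder strings are hypothetical and are not part of MMRazor.

```python
def resolve_mappings(mappings, recorded):
    """Pick one recorded item per mapping entry.

    `recorded` maps a recorder name to the sequence it captured.
    In this sketch, a missing `data_idx` means "return everything".
    """
    resolved = {}
    for name, entry in mappings.items():
        outputs = recorded[entry['recorder']]
        data_idx = entry.get('data_idx')
        resolved[name] = outputs if data_idx is None else outputs[data_idx]
    return resolved


# Placeholder strings stand in for the per-level output tensors (assumed).
recorded = {'neck': ('feat_p3', 'feat_p4', 'feat_p5', 'feat_p6', 'feat_p7')}
mappings = dict(
    p3=dict(recorder='neck', data_idx=0),
    p4=dict(recorder='neck', data_idx=1),
    p5=dict(recorder='neck', data_idx=2),
    p6=dict(recorder='neck', data_idx=3))

selected = resolve_mappings(mappings, recorded)
```

With this reading, `p3` to `p6` are simply the first four elements of the recorded neck output, which is why the config lists `data_idx` 0 through 3.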
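@@ -0,0 +1,158 @@
## Visualization

## Feature map visualization

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720299-774b202c-fecc-414b-9f31-499092caee18.jpg" width="1000" alt="image"/>
</div>
Visualization offers an intuitive explanation of what a deep learning model does during training and testing.

In MMRazor, feature maps are visualized by combining the `Visualizer` provided by MMEngine with the data-recording capability of MMRazor's own `Recorder` components. This provides the following:

- Basic drawing interfaces as well as feature map visualization.
- Feature maps can be taken from any point in the model and displayed in four ways: `pixel_wise_max`, `squeeze_mean`, `select_max`, and `topk`. Users can also customize the layout of the displayed feature maps with `arrangement`.

## Drawing feature maps

You can run `tools/visualizations/feature_visualization.py` to quickly get the visualization result for a single image and a single model.

For ease of understanding, the main parameters are summarized below:

- `img`: The image to use for feature map visualization. Accepts a single image or a list of image paths.

- `config`: The config file of the algorithm.

- `vis_config`: Visualization relies on configurable `Recorder` components to capture feature maps at user-defined points in the model.
  Users can put the `Recorder`-related configuration in `vis_config`. MMRazor provides config files for visualizing backbone and neck
  outputs; see `configs/visualizations`.

- `checkpoint`: The weight file for the chosen algorithm.

- `--out-file`: Save the resulting feature map locally, with the given path and file name. If not set, the feature map is displayed directly.

- `--device`: The hardware used to run inference on the image. `--device cuda:0` uses the first GPU; `--device cpu` uses the CPU.

- `--repo`: The codebase the model belongs to. `--repo mmdet` means the model is a detection model.

- `--use-norm`: Whether to apply batch normalization to the captured feature maps before displaying them.

- `--overlaid`: Whether to overlay the feature map on the original image. If set to True, the feature map is upsampled by default before visualization, because input feature maps are usually very small.

- `--channel-reduction`: The input tensor usually has multiple channels. `channel_reduction` compresses them into a single channel, which is then overlaid on the image. The following values can be set:

  - `pixel_wise_max`: Compress the C dimension of the input into one channel with the max function. The output shape becomes (1, H, W).
  - `squeeze_mean`: Compress the C dimension of the input into one channel with the mean function. The output shape becomes (1, H, W).
  - `select_max`: Sum each channel over the spatial dimensions, giving shape (C, ), then select the channel with the largest value.
  - `None`: No compression. In this case the `topk` parameter can be used to display the `topk` feature maps with the highest activation.

- `--topk`: Takes effect only when `channel_reduction` is `None`. It selects `topk` channels ranked by activation and overlays them on the image. The layout is then specified with `--arrangement`, which takes two numbers separated by a space. For example, `--topk 5 --arrangement 2 3` displays the 5 most activated feature maps in `2 rows and 3 columns`, and `--topk 7 --arrangement 3 3` displays the 7 most activated feature maps in `3 rows and 3 columns`.

  - If topk is not -1, `topk` channels are selected by activation and displayed.
  - If topk = -1, the number of channels C must be 1 or 3, meaning the input is an image. Otherwise an error is raised, prompting the user to set `channel_reduction` to compress the channels.

- `--arrangement`: The layout of the feature maps. Only used when `channel_reduction` is None and topk > 0.

- `--resize-shape`: When `--overlaid` is True, whether to resize the original image and the feature map to a given size.

- `--cfg-options`: Visualizers in different codebases have their own specialized `add_datasample` methods. For example, mmdet's visualizer
  takes `pred_score_thr` as an input argument. Such codebase-specific settings can be added through `--cfg-options`.

Similarly, users can run `tools/visualizations/feature_diff_visualization.py` to visualize
the feature difference between two models on a single image. The usage is the same as above, except for:

- `config1` / `config2`: The config files of algorithm 1 / 2.
- `checkpoint1` / `checkpoint2`: The weight files for algorithm 1 / 2.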
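
The interplay between `--channel-reduction`, `--topk` and `--arrangement` described above can be sketched with a small parser. This is only an illustration under stated assumptions: `build_parser` and `check_layout` are hypothetical helpers, not the actual script, and the defaults are borrowed from the hook's `visualization_cfg` rather than from the tool itself.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring a few of the flags described above."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--channel-reduction', default='pixel_wise_max')
    parser.add_argument('--topk', type=int, default=20)
    # Two space-separated integers: rows, then columns.
    parser.add_argument('--arrangement', nargs=2, type=int, default=[4, 5])
    return parser


def check_layout(topk, arrangement):
    """Return True if `topk` feature maps fit into a rows x cols grid."""
    rows, cols = arrangement
    return topk <= rows * cols


args = build_parser().parse_args(
    ['--channel-reduction', 'None', '--topk', '5', '--arrangement', '2', '3'])
# argparse keeps 'None' as text, so it is converted by hand here.
reduction = None if args.channel_reduction == 'None' else args.channel_reduction
fits = check_layout(args.topk, args.arrangement)
```

Under this sketch, `--topk 7 --arrangement 3 3` passes the layout check, while `--topk 7 --arrangement 2 3` would not, since six cells cannot hold seven feature maps.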

## Usage examples

The examples below use the pretrained RetinaNet-r101 and RetinaNet-r50 models.

Please download the RetinaNet-r101 and RetinaNet-r50 weights to the root of this repository in advance:

```shell
cd mmrazor
wget https://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_fpn_2x_coco/retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth
wget https://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_2x_coco/retinanet_r50_fpn_2x_coco_20200131-fdb43119.pth
```

(1) Compress the multi-channel feature maps into a single channel with `pixel_wise_max` and display them, using the output of the `neck` for visualization (only the feature maps of the first 4 stages are shown here):

```shell
python tools/visualizations/feature_visualization.py \
tools/visualizations/demo.jpg \
PATH/TO/THE/CONFIG \
tools/visualizations/vis_configs/fpn_feature_visualization.py \
retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth \
--repo mmdet --use-norm --overlaid \
--channel-reduction pixel_wise_max
```

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720372-08e29a02-21ce-46a4-910a-97aabe7ec796.jpg" width="800" alt="image"/>
</div>

(2) Compress the multi-channel feature maps into a single channel with `select_max` and display them, using the output of the `neck` for visualization (only the feature maps of the first 4 stages are shown here):

```shell
python tools/visualizations/feature_visualization.py \
tools/visualizations/demo.jpg \
PATH/TO/THE/CONFIG \
tools/visualizations/vis_configs/fpn_feature_visualization.py \
retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth \
--repo mmdet --overlaid \
--channel-reduction select_max
```

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720581-0ed2fd5a-e07d-4320-90e7-adbe0f05fd41.jpg" width="800" alt="image"/>
</div>

(3) Compress the multi-channel feature maps into a single channel with `squeeze_mean` and display them, using the output of the `neck` for visualization (only the feature maps of the first 4 stages are shown here):

```shell
python tools/visualizations/feature_visualization.py \
tools/visualizations/demo.jpg \
PATH/TO/THE/CONFIG \
tools/visualizations/vis_configs/fpn_feature_visualization.py \
retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth \
--repo mmdet --overlaid \
--channel-reduction squeeze_mean
```

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720659-b25fcbcf-c5c5-45a6-965d-5336d136acde.jpg" width="800" alt="image"/>
</div>

(4) Compress the multi-channel feature maps into a single channel with `squeeze_mean` and display them, using the output of the `neck` for visualization (only the feature maps of the first 4 stages are shown here):

```shell
python tools/visualizations/feature_visualization.py \
tools/visualizations/demo.jpg \
PATH/TO/THE/CONFIG \
tools/visualizations/vis_configs/fpn_feature_visualization.py \
retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth \
--repo mmdet --overlaid \
--channel-reduction squeeze_mean
```

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720735-f30299be-2c42-444b-bec6-759723ad43fa.jpg" width="800" alt="image"/>
</div>

(5) Compress the multi-channel feature map difference between two models into a single channel with `pixel_wise_max` and display it (only the feature map differences of the first 4 stages are shown here):

```shell
python tools/visualizations/feature_diff_visualization.py \
tools/visualizations/demo.jpg \
PATH/TO/THE/CONFIG1 \
PATH/TO/THE/CONFIG2 \
tools/visualizations/vis_configs/fpn_feature_diff_visualization.py \
retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth \
retinanet_r50_fpn_2x_coco_20200131-fdb43119.pth \
--repo mmdet --use-norm --overlaid \
--channel-reduction pixel_wise_max
```

<div align=center>
<img src="https://user-images.githubusercontent.com/41630003/197720804-be0f3a27-e4d7-4160-b518-33d527114f9f.jpg" width="800" alt="image"/>
</div>
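
The channel reduction modes used in these examples can be sketched in a few lines of NumPy. This is an illustrative approximation of the behaviour described in the parameter list, not the code MMEngine or MMRazor actually runs. The function names are hypothetical, "activation" is assumed to mean the spatial sum of a channel, and the leading singleton dimension of the (1, H, W) output is dropped for brevity.

```python
import numpy as np


def reduce_channels(featmap, mode):
    """Collapse a (C, H, W) array to a single (H, W) map."""
    if mode == 'pixel_wise_max':
        # Per-pixel maximum across channels.
        return featmap.max(axis=0)
    if mode == 'squeeze_mean':
        # Per-pixel mean across channels.
        return featmap.mean(axis=0)
    if mode == 'select_max':
        # Sum each channel spatially, then keep the strongest channel.
        channel_sums = featmap.sum(axis=(1, 2))
        return featmap[channel_sums.argmax()]
    raise ValueError(f'unknown mode: {mode}')


def topk_channels(featmap, k):
    """Return the k channels with the largest spatial sum, highest first."""
    channel_sums = featmap.sum(axis=(1, 2))
    order = np.argsort(channel_sums)[::-1][:k]
    return featmap[order]


# Three constant 2x2 channels, with one strong pixel added to channel 0.
featmap = np.stack([np.full((2, 2), v, dtype=float) for v in (1.0, 3.0, 2.0)])
featmap[0, 0, 0] = 4.0

max_map = reduce_channels(featmap, 'pixel_wise_max')
mean_map = reduce_channels(featmap, 'squeeze_mean')
best_channel = reduce_channels(featmap, 'select_max')
top2 = topk_channels(featmap, 2)
```

In this toy input, `pixel_wise_max` keeps the single strong pixel from channel 0, whereas `select_max` returns channel 1 unchanged because it has the largest overall sum.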
@@ -1,5 +1,6 @@
 # Copyright (c) OpenMMLab. All rights reserved.
 from .dump_subnet_hook import DumpSubnetHook
 from .estimate_resources_hook import EstimateResourcesHook
+from .visualization_hook import RazorVisualizationHook

-__all__ = ['DumpSubnetHook', 'EstimateResourcesHook']
+__all__ = ['DumpSubnetHook', 'EstimateResourcesHook', 'RazorVisualizationHook']
@@ -0,0 +1,205 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import warnings
from typing import List, Optional, Union

import mmcv
import torch
from mmcv.transforms import Compose
from mmengine.dist import master_only
from mmengine.fileio import FileClient
from mmengine.hooks import Hook
from mmengine.model import is_model_wrapper
from mmengine.utils import mkdir_or_exist
from mmengine.visualization import Visualizer

from mmrazor.models.task_modules import RecorderManager
from mmrazor.registry import HOOKS
from mmrazor.visualization.local_visualizer import modify


def norm(feat):
    # Standardize each channel over the batch and spatial dimensions.
    assert len(feat.shape) == 4
    N, C, H, W = feat.shape
    # (N, C, H, W) -> (C, N*H*W) so statistics are computed per channel.
    feat = feat.permute(1, 0, 2, 3).reshape(C, -1)
    mean = feat.mean(dim=-1, keepdim=True)
    std = feat.std(dim=-1, keepdim=True)
    centered = (feat - mean) / (std + 1e-6)
    # Restore the original (N, C, H, W) layout.
    centered = centered.reshape(C, N, H, W).permute(1, 0, 2, 3)
    return centered


@HOOKS.register_module()
class RazorVisualizationHook(Hook):
    """Razor Visualization Hook. Used to visualize intermediate feature maps
    during the training process.

    1. If ``show`` is True, it means that only the intermediate feature maps
        are visualized without storing data, so ``vis_backends`` needs to
        be excluded.
    2. If ``out_dir`` is specified, it means that the intermediate feature
        maps need to be saved to ``out_dir``. To avoid ``vis_backends``
        also storing data, ``vis_backends`` needs to be excluded.
    3. ``vis_backends`` takes effect if the user does not specify ``show``
        and ``out_dir``. You can set ``vis_backends`` to WandbVisBackend or
        TensorboardVisBackend to store the intermediate feature maps in
        Wandb or Tensorboard.

    Args:
        recorders (dict): All recorders' config.
        mappings: (Dict[str, Dict]): The mapping between feature names and
            records.
        enabled (bool): Whether to draw intermediate feature maps. If it is
            False, no drawing will be done. Defaults to False.
        data_idx (int | List[int]): Index or indices of the dataset samples
            to visualize. Defaults to 0.
        interval (int): The interval of visualization. Defaults to 1.
        show (bool): Whether to display the drawn image. Defaults to False.
        wait_time (float): The interval of show (s). Defaults to 0.1.
        out_dir (str, optional): directory where painted images
            will be saved in testing process.
        file_client_args (dict): Arguments to instantiate a FileClient.
            See :class:`mmengine.fileio.FileClient` for details.
            Defaults to ``dict(backend='disk')``.
        is_overlaid (bool): If `is_overlaid` is True, the final output image
Review comment: 'is_overlaid' may cause some misunderstanding, 'mix_with_img' may be better?
            will be the weighted sum of img and featmap. Defaults to True.
        visualization_cfg (dict): Configs for visualization.
        use_norm (bool): Whether to apply Batch Normalization over the
            feature map. Defaults to False.
    """

    def __init__(self,
                 recorders: dict,
                 mappings: dict,
                 enabled: bool = False,
                 data_idx: Union[int, List] = 0,
                 interval: int = 1,
                 show: bool = False,
                 wait_time: float = 0.1,
                 out_dir: Optional[str] = None,
                 file_client_args: dict = dict(backend='disk'),
                 is_overlaid: bool = True,
                 visualization_cfg=dict(
                     channel_reduction='pixel_wise_max',
                     topk=20,
                     arrangement=(4, 5),
                     resize_shape=None,
                     alpha=0.5),
                 use_norm: bool = False):
        self.enabled = enabled
        self._visualizer: Visualizer = Visualizer.get_current_instance()
        self._visualizer.draw_featmap = modify
        if isinstance(data_idx, int):
            data_idx = [data_idx]
        self.data_idx = data_idx
        self.show = show
        if self.show:
            # No need to think about vis backends.
            self._visualizer._vis_backends = {}
            warnings.warn('The show is True, it means that only '
                          'the prediction results are visualized '
                          'without storing data, so vis_backends '
                          'needs to be excluded.')

        self.wait_time = wait_time
        self.file_client_args = file_client_args.copy()
        self.file_client = None
        self.out_dir = out_dir
        # Tracks whether ``out_dir`` has been joined with the work dir yet.
        self._out_dir_resolved = False
        self.interval = interval

        self.is_overlaid = is_overlaid
        self.visualization_cfg = visualization_cfg
        self.use_norm = use_norm

        self.recorder_manager = RecorderManager(recorders)
        self.mappings = mappings

        self._step = 0  # Global step value to record

    @master_only
    def before_run(self, runner) -> None:
        model = runner.model
        if is_model_wrapper(model):
            self.recorder_manager.initialize(model.module)
        else:
            self.recorder_manager.initialize(model)

    @master_only
    def before_train(self, runner):
        if not self.enabled or runner.epoch % self.interval != 0:
            return
        self._visualize(runner, 'before_run')

    @master_only
    def after_train_epoch(self, runner) -> None:
        if not self.enabled or runner.epoch % self.interval != 0:
            return
        self._visualize(runner, f'epoch_{runner.epoch}')

    def _visualize(self, runner, stage):
        if self.out_dir is not None:
            # Resolve the output directory only once. Re-joining on every
            # call would nest the path one level deeper each time.
            if not self._out_dir_resolved:
                self.out_dir = osp.join(runner.work_dir, runner.timestamp,
                                        self.out_dir)
                self._out_dir_resolved = True
            mkdir_or_exist(self.out_dir)

        if self.file_client is None:
            self.file_client = FileClient(**self.file_client_args)

        cfg = runner.cfg.copy()
        test_pipeline = cfg.test_dataloader.dataset.pipeline
        new_test_pipeline = []
        # Annotations are not needed to run inference on a single image.
        for pipeline in test_pipeline:
            if pipeline['type'] != 'LoadAnnotations' and pipeline[
                    'type'] != 'LoadPanopticAnnotations':
                new_test_pipeline.append(pipeline)

        test_pipeline = Compose(new_test_pipeline)
        dataset = runner.val_loop.dataloader.dataset

        for idx in self.data_idx:
            data_info = dataset.get_data_info(idx)
            img_path = data_info['img_path']
            data_ = dict(img_path=img_path, img_id=0)
            data_ = test_pipeline(data_)

            data_['inputs'] = [data_['inputs']]
            data_['data_samples'] = [data_['data_samples']]

            with torch.no_grad(), self.recorder_manager:
                runner.model.test_step(data_)

            if self.is_overlaid:
                img_bytes = self.file_client.get(img_path)
                overlaid_image = mmcv.imfrombytes(
                    img_bytes, channel_order='rgb')
            else:
                overlaid_image = None

            for name, record in self.mappings.items():
                recorder = self.recorder_manager.get_recorder(record.recorder)
                record_idx = getattr(record, 'record_idx', 0)
                data_idx = getattr(record, 'data_idx', None)
                feats = recorder.get_record_data(record_idx, data_idx)
                if isinstance(feats, torch.Tensor):
                    feats = (feats, )

                for i, feat in enumerate(feats):
                    if self.use_norm:
                        feat = norm(feat)
                    drawn_img = self._visualizer.draw_featmap(
                        feat[0], overlaid_image, **self.visualization_cfg)

                    out_file = None
                    if self.out_dir is not None:
                        out_file = f'{stage}_data_idx_{idx}_{name}_{i}.jpg'
                        out_file = osp.join(self.out_dir, out_file)

                    self._visualizer.add_datasample(
                        f'{stage}_data_idx_{idx}_{name}_{i}',
                        drawn_img,
                        draw_gt=False,
                        draw_pred=False,
                        show=self.show,
                        wait_time=self.wait_time,
                        # TODO: Supported in mmengine's Visualizer.
                        out_file=out_file,
                        step=self._step)
                    self._step += 1
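
For readers who want to check what `norm` in this file computes without a PyTorch install, here is a minimal NumPy sketch of the same per-channel standardization. It is an assumed equivalent written for illustration, not part of the PR. The name `norm_np` is hypothetical, and `ddof=1` is used because `torch.Tensor.std` applies the unbiased estimate by default.

```python
import numpy as np


def norm_np(feat, eps=1e-6):
    """Standardize each channel of an (N, C, H, W) array."""
    assert feat.ndim == 4
    # Statistics are taken over batch and spatial axes, one set per channel.
    mean = feat.mean(axis=(0, 2, 3), keepdims=True)
    std = feat.std(axis=(0, 2, 3), ddof=1, keepdims=True)
    return (feat - mean) / (std + eps)


rng = np.random.default_rng(0)
feat = rng.normal(loc=3.0, scale=2.0, size=(2, 4, 8, 8))
out = norm_np(feat)
```

After this step every channel has roughly zero mean and unit standard deviation, so channels with very different value ranges become comparable before they are drawn.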
@@ -0,0 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .local_visualizer import modify

__all__ = ['modify']
Add the bash command of this config for getting started in readme will be better.
Maybe we need to add a new user guide named `visualize feature maps` in the future?