You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi @jacobgil , I'm working on applying Score-CAM on YOLOv5 by implementing the YOLOBoxScoreTarget class (so this issue is closely related with #242). There are several issues I want to iron out before making a PR (e.g. YOLOv5 returns parseable Detection objects only when the input is not a Torch tensor - see ultralytics/yolov5#6726 - and my workaround is ad-hoc atm), but in my current implementation I find that the resulting heatmaps are inverted, e.g.:
Outputs for dog & cat example, unnormalized (L)/normalized (R):
Also observed similar trends with other images like the 5-dogs example. My YOLOBoxScoreTarget is as follows, it is essentially identical to FasterRCNNBoxScoreTarget, except it leverages parse_detections from the YOLOv5 notebook:
class YOLOBoxScoreTarget:
""" For every original detected bounding box specified in "bounding boxes",
assign a score on how the current bounding boxes match it,
1. In IOU
2. In the classification score.
If there is not a large enough overlap, or the category changed,
assign a score of 0.
The total score is the sum of all the box scores.
"""
def __init__(self, labels, bounding_boxes, iou_threshold=0.5):
self.labels = labels
self.bounding_boxes = bounding_boxes
self.iou_threshold = iou_threshold
def __call__(self, model_outputs):
boxes, colors, categories, names, confidences = parse_detections(model_outputs)
boxes = torch.Tensor(boxes)
output = torch.Tensor([0])
if torch.cuda.is_available():
output = output.cuda()
boxes = boxes.cuda()
if len(boxes) == 0:
return output
for box, label in zip(self.bounding_boxes, self.labels):
box = torch.Tensor(box[None, :])
if torch.cuda.is_available():
box = box.cuda()
ious = torchvision.ops.box_iou(box, boxes)
index = ious.argmax()
if ious[0, index] > self.iou_threshold and categories[index] == label:
score = ious[0, index] + confidences[index]
output = output + score
return output
I also slightly altered get_cam_weights in score_cam.py to make it play nice with numpy inputs instead of torch tensors.
import torch
import tqdm
from .base_cam import BaseCAM
import numpy as np
class ScoreCAM(BaseCAM):
def __init__(
self,
model,
target_layers,
use_cuda=False,
reshape_transform=None):
super(ScoreCAM, self).__init__(model,
target_layers,
use_cuda,
reshape_transform=reshape_transform,
uses_gradients=False)
if len(target_layers) > 0:
print("Warning: You are using ScoreCAM with target layers, "
"however ScoreCAM will ignore them.")
def get_cam_weights(self,
input_tensor,
target_layer,
targets,
activations,
grads):
with torch.no_grad():
upsample = torch.nn.UpsamplingBilinear2d(
size=input_tensor.shape[-2:])
activation_tensor = torch.from_numpy(activations)
if self.cuda:
activation_tensor = activation_tensor.cuda()
upsampled = upsample(activation_tensor)
maxs = upsampled.view(upsampled.size(0),
upsampled.size(1), -1).max(dim=-1)[0]
mins = upsampled.view(upsampled.size(0),
upsampled.size(1), -1).min(dim=-1)[0]
maxs, mins = maxs[:, :, None, None], mins[:, :, None, None]
upsampled = (upsampled - mins) / (maxs - mins)
input_tensors = input_tensor[:, None,
:, :] * upsampled[:, :, None, :, :]
if hasattr(self, "batch_size"):
BATCH_SIZE = self.batch_size
else:
BATCH_SIZE = 16
scores = []
for target, tensor in zip(targets, input_tensors):
for i in tqdm.tqdm(range(0, tensor.size(0), BATCH_SIZE)):
batch = tensor[i: i + BATCH_SIZE, :]
# TODO: current solution to handle the issues with torch inputs, improve
batch = list(batch.numpy())
batch = [np.swapaxes((elt * 255).astype(np.uint8), 0, -1) for elt in batch]
outs = [self.model(b) for b in batch]
outputs = [target(o).cpu().item() for o in outs]
scores.extend(outputs)
scores = torch.Tensor(scores)
scores = scores.view(activations.shape[0], activations.shape[1])
weights = torch.nn.Softmax(dim=-1)(scores).numpy()
return weights
To me the implementations seem correct, and therefore I am not able to address why the resulting heatmaps seem inverted. Any ideas? I'd be happy to make a branch with a reproducible example to debug and potentially extend it to a PR. Please let me know.
The text was updated successfully, but these errors were encountered:
@mahendra-gehlot unfortunately I haven't dug deep into the code to figure out what's the underlying issue, but have noticed something interesting. It seems the heatmaps are correct for yolov5s, but inverted for yolov5n/m/l. Furthermore, this exclusively affects EigenCAM, ScoreCAM does not suffer from this for any model I've tried. Here are the outputs for n/s/m + ScoreCAM (n) for for example:
YOLOv5n
YOLOv5s
YOLOv5m
YOLOv5n (ScoreCAM)
This also relates with @syncsyncsync 's solution, which doesn't really explain why it works for yolov5s and ScoreCAM which uses the same image load/save pipeline. I have no intuition yet into why this is happening, but it may prove a good starting point for debugging (for which I may not find sufficient time in the near future I'm afraid).
Hi @jacobgil , I'm working on applying Score-CAM on YOLOv5 by implementing the
YOLOBoxScoreTarget
class (so this issue is closely related with #242). There are several issues I want to iron out before making a PR (e.g. YOLOv5 returns parseableDetection
objects only when the input is not a Torch tensor - see ultralytics/yolov5#6726 - and my workaround is ad-hoc atm), but in my current implementation I find that the resulting heatmaps are inverted, e.g.:Outputs for dog & cat example, unnormalized (L)/normalized (R):
Also observed similar trends with other images like the 5-dogs example. My
YOLOBoxScoreTarget
is as follows, it is essentially identical to FasterRCNNBoxScoreTarget, except it leveragesparse_detections
from the YOLOv5 notebook:I also slightly altered
get_cam_weights
in score_cam.py to make it play nice with numpy inputs instead of torch tensors.To me the implementations seem correct, and therefore I am not able to address why the resulting heatmaps seem inverted. Any ideas? I'd be happy to make a branch with a reproducible example to debug and potentially extend it to a PR. Please let me know.
The text was updated successfully, but these errors were encountered: