
Bug in *.attribute -- New tensor in gathering ignoring used device #316

Closed

kai-tub opened this issue Mar 7, 2020 · 3 comments


kai-tub commented Mar 7, 2020

Hi, first of all, thanks for providing and working on such a neat library!
I think I found a "bug" in the common.py file, which is used by most attribution methods. The problem arises when a CUDA device is used and a non-tensor target is given, for example a plain list.

I don't know whether you officially support CUDA, but as I couldn't find any hints indicating otherwise, I had been using the Saliency function without any problems on a CUDA device while passing a tensor as the target. After some refactoring, I changed the target to a plain list and the error occurred. I traced it down to the following line:
return torch.gather(output, 1, torch.tensor(target).reshape(len(output), 1))

As can be seen, a new tensor is created without any device information. A simple fix would be to look at where the output tensor lives:

device = "cuda" if output.is_cuda else "cpu"
return torch.gather(output, 1, torch.tensor(target).reshape(len(output), 1).to(device))
# EDIT alternative:
device = output.device
return torch.gather(output, 1, torch.tensor(target, device=device).reshape(len(output), 1))

(I don't know whether the first variant could cause problems when the data lies on a GPU other than the default one, as I have no experience with multiple GPUs.)

The tensor version works because the user can move it to the correct device beforehand; the existing code only reshapes it:
return torch.gather(output, 1, target.reshape(len(output), 1))
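
For illustration, a device-agnostic helper that accepts either a list or a tensor target might look roughly like this (a minimal sketch of the proposed fix; the name _gather_target is hypothetical, not Captum's actual function):

import torch

# Hypothetical helper -- a sketch of the fix, not Captum's actual code.
def _gather_target(output, target):
    if not torch.is_tensor(target):
        # Create the index tensor directly on the device of `output`,
        # so this also works when `output` lives on e.g. cuda:1.
        target = torch.tensor(target, device=output.device)
    return torch.gather(output, 1, target.reshape(len(output), 1))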

I've included a minimal example highlighting the problem:

# minimal_example.py
import torch
import torchvision
from captum.attr import Saliency

# device = "cpu" works!
device = "cuda"
model = torchvision.models.alexnet(pretrained=False)
model.to(device)
model.eval()
X = torch.rand(3, 3, 224, 224, dtype=torch.float)  # batch of 3
X = X.to(device)
y_pred = model(X)
saliency = Saliency(model)
X.requires_grad = True
grads = saliency.attribute(X, target=[0, 1, 2])  # plain list -> fails on CUDA
# working alternative:
# grads = saliency.attribute(X, target=torch.tensor([0, 1, 2]).to(device))
print(grads.shape)
# The reason is the gather call that creates a new tensor WITHOUT
# regarding the device of the model output.
@kai-tub kai-tub changed the title Bug in *.attribute -- Gathering ignoring used device Bug in *.attribute -- New tensor in gathering ignoring used device Mar 7, 2020

vivekmig commented Mar 7, 2020

Hi @kai-tub, thanks for the detailed information. You are right, this is definitely a bug! Your proposed fix of keeping the output's device looks great; would you like to create a PR making that change?


kai-tub commented Mar 7, 2020

Yes, I would like to create a PR. :)
I will take a closer look at it tomorrow.

facebook-github-bot pushed a commit that referenced this issue Mar 10, 2020
Summary:
Hey,
Here is my proposed fix for #316.
I've added a test case that uses the BaseGPUTest class and simply runs the saliency target tests again, with everything on the GPU. As I did not fully understand the `_target_batch_test_assert` function and what the requirements for the tests are, I simply copied the function for now. I assume there may be an obvious way to integrate these tests. :)

Another question:
Do we want to enforce the move if we receive a target tensor that is not on the right device?
We could print a UserWarning, similar to the one for automatically setting required gradients in gradient.py:33.
Or should the user be responsible for moving the target tensor to the correct device?

Best regards,
Kai
Pull Request resolved: #317

Reviewed By: NarineK

Differential Revision: D20371529

Pulled By: vivekmig

fbshipit-source-id: 92461d358f589bf47487d6170fac6a54e1d83123
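
For illustration, the warn-and-move option raised in the commit message could look roughly like this (a minimal sketch; the helper name and warning text are hypothetical, not Captum's actual implementation):

import warnings
import torch

# Hypothetical helper -- a sketch of the warn-and-move idea, not Captum's code.
# `device` is expected to be the model output's device (e.g. output.device).
def _move_target_to_device(target, device):
    if torch.is_tensor(target) and target.device != device:
        warnings.warn(
            "Target tensor is on {} but the model output is on {}; "
            "moving it automatically.".format(target.device, device)
        )
        target = target.to(device)
    return target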

NarineK commented Mar 12, 2020

Closing since the PR is merged! Thank you!

@NarineK NarineK closed this as completed Mar 12, 2020
NarineK pushed a commit to NarineK/captum-1 that referenced this issue Nov 19, 2020