
Attention Map Generation #18

Closed · Hugh0120 opened this issue Jan 18, 2021 · 14 comments

@Hugh0120

Thanks for releasing the pretrained models!
I was wondering if it is possible to show attention maps of input images using the released ViT-B/32 model?

@jongwook
Collaborator

It is not very convenient, but it should be possible this way:

  • Modify the call to nn.MultiheadAttention in model.py to pass need_weights=True.
  • Use the module's second return value, which contains the attention weights. (The current code discards it by appending [0] to the call.)
  • Load the CLIP model with jit=False, so that the modified model implementation is used.

It could have been more elegant with forward hooks, but PyTorch does not currently allow modifying the keyword arguments of forward_pre_hooks.

Another caveat is that the multihead attention implementation only returns the attention weights averaged over all heads, so you might want to delete the .sum(dim=1) / num_heads part if you want a per-head visualization.
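
For reference, here is a minimal sketch of these steps (my own illustration, not code from this repo; it assumes the standard clip/model.py layout where ResidualAttentionBlock is defined, and a hypothetical example.jpg). Instead of editing model.py, it patches the attention method at runtime:

```python
import torch
from PIL import Image

import clip
from clip.model import ResidualAttentionBlock

attention_maps = []  # one [batch, tokens, tokens] tensor per transformer block

def attention_with_weights(self, x: torch.Tensor):
    attn_mask = self.attn_mask.to(dtype=x.dtype, device=x.device) if self.attn_mask is not None else None
    # need_weights=True makes nn.MultiheadAttention also return the attention
    # weights (averaged over heads); the original code discards them via [0].
    out, weights = self.attn(x, x, x, need_weights=True, attn_mask=attn_mask)
    attention_maps.append(weights.detach())
    return out

ResidualAttentionBlock.attention = attention_with_weights

device = "cuda" if torch.cuda.is_available() else "cpu"
# jit=False is required so the patched Python implementation is actually used.
model, preprocess = clip.load("ViT-B/32", device=device, jit=False)

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    attention_maps.clear()
    model.encode_image(image)

# attention_maps now holds one tensor per visual transformer block; for
# ViT-B/32 each has shape [1, 50, 50] (1 class token + 7x7 image patches).
```

Note that the list would also collect weights from the text transformer if you call encode_text, since both transformers use ResidualAttentionBlock.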

@betterze

@Hugh0120 @jongwook Is there an implementation of this? I am new to transformers; a simple demo would be very helpful. Thanks in advance.

@haofanwang
Contributor

Hi @betterze and @Hugh0120, I implemented the attention map based on the suggestions above. Here is my implementation, which may be helpful to you.

@betterze

@haofanwang Thank you for sharing this nice implementation with us.

@g-luo

g-luo commented Apr 22, 2021

Does anyone have any insight into which layer is best to use as the saliency layer for Grad-CAM with the visual transformer (ViT) models? For the ResNets, people use the ReLU from the last visual layer, and I'm wondering what the best choice is for the visual transformer.

EDIT: I realize that Grad-CAM is not meant for transformers and is mainly for CNN/ResNet-based models. Feel free to ignore the question above.

@kaizhaol

@g-luo I don't think Grad-CAM on the activation is a good idea; the patch setting makes it too coarse to be useful. In general, though, when dealing with a ViT, people usually use the output of the last norm layer of the last transformer block. So far I have had mixed results with CLIP.
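
In case it helps, here is a small sketch (my own, untested) of grabbing that activation via a forward hook. The attribute names follow the openai/CLIP ViT (model.visual.transformer.resblocks, each block with LayerNorms ln_1 and ln_2); which of the two norms works better as a Grad-CAM target is something to experiment with:

```python
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device, jit=False)

captured = {}

def save_activation(module, inputs, output):
    # The visual transformer runs in [tokens, batch, width] order; token 0 is
    # the class token. For Grad-CAM you would also need the gradient of this
    # tensor, e.g. via output.register_hook.
    captured["activation"] = output

last_block = model.visual.transformer.resblocks[-1]
handle = last_block.ln_2.register_forward_hook(save_activation)
# ... run model.encode_image(...) here, then call handle.remove() when done.
```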

@Maddy12

Maddy12 commented Jun 21, 2022

Does anyone have an implementation on the visual side using CLIP with a ViT?

@ricardodeazambuja

Since this is the top result on Google, it may be easier for me to find my fork here than by using the search box on GitHub 😆
I followed the suggestions from the comments above and modified the code so I can extract the weights and visualize the attention for images:
https://github.com/ricardodeazambuja/CLIP/blob/attn_weights/notebooks/Interacting_with_CLIP.ipynb
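
For anyone who just wants a quick overlay, below is a generic sketch (not taken from the notebook above). It assumes weights is one [batch, tokens, tokens] attention tensor from ViT-B/32 (1 class token plus a 7x7 patch grid, e.g. collected as described earlier in this thread) and image is the original PIL image:

```python
import matplotlib.pyplot as plt

def show_class_token_attention(weights, image, grid_size=7):
    # Attention paid by the class token (index 0) to the patch tokens.
    cls_attn = weights[0, 0, 1:].detach().cpu().numpy().reshape(grid_size, grid_size)
    plt.imshow(image)
    # Stretch the coarse patch grid over the full image and blend it on top.
    plt.imshow(cls_attn, cmap="jet", alpha=0.5,
               extent=(0, image.size[0], image.size[1], 0))
    plt.axis("off")
    plt.show()
```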

@zhanjiahui

zhanjiahui commented Feb 24, 2023

> Since this is the top result on Google, it may be easier for me to find my fork here than by using the search box on GitHub 😆 I followed the suggestions from the comments above and modified the code so I can extract the weights and visualize the attention for images: https://github.com/ricardodeazambuja/CLIP/blob/attn_weights/notebooks/Interacting_with_CLIP.ipynb

Hello @ricardodeazambuja, thank you for your recommendation.

However, when I run this notebook, the following error occurs. Do you know how to solve it?

[error screenshots]

@ricardodeazambuja

@zhanjiahui, are you using the attn_weights branch from my repository?

@zhanjiahui

> @zhanjiahui, are you using the attn_weights branch from my repository?

@ricardodeazambuja Thank you for your reply. Yes, I downloaded CLIP-attn-weights from your repository. 😭

@ricardodeazambuja

@zhanjiahui, everything works fine here (I had a problem related to a change in behaviour in how Pillow and NumPy exchange data). I would bet that when you do import clip you are importing the normal clip instead of the one from the attn_weights branch, because the notebooks are not at the base directory, so they won't see the clip directory by default.
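
A quick way to check which clip is actually being imported (an illustrative snippet, not from the notebook):

```python
import clip
# Should print a path inside your attn_weights checkout, not site-packages.
print(clip.__file__)
```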

@zhanjiahui

zhanjiahui commented Feb 24, 2023

> @zhanjiahui, everything works fine here (I had a problem related to a change in behaviour in how Pillow and NumPy exchange data). I would bet that when you do import clip you are importing the normal clip instead of the one from the attn_weights branch, because the notebooks are not at the base directory, so they won't see the clip directory by default.

Hi @ricardodeazambuja, I uninstalled the normal clip and moved this ipynb file into the root directory of your project, but the error still happens. I double-checked that the encode_image function has exactly one return value in your repository. This is very confusing to me.

[error screenshot]

@ricardodeazambuja

ricardodeazambuja commented Feb 24, 2023

> Hi @ricardodeazambuja, I uninstalled the normal clip and moved this ipynb file into the root directory of your project, but the error still happens. I double-checked that the encode_image function has exactly one return value in your repository. This is very confusing to me.

@zhanjiahui, keep following the flow of the code and you will find it here:
https://github.com/ricardodeazambuja/CLIP/blob/9340bb39d05162605ed38b5999f68e9b7d390e72/clip/model.py#L230

Try pulling the latest version of the repo, checking out the attn_weights branch, and installing it with `pip install -e .`.
