
refactor feature ablation #1047

Closed
wants to merge 4 commits

Conversation

aobo-y
Contributor

@aobo-y commented Oct 14, 2022

  • refactor feature ablation to make the logic around multi-output attribution more readable
  • remove some outdated, misleading comments
  • add more comments to explain the steps and tensor shapes

@facebook-github-bot
Contributor

@aobo-y has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

torch.zeros(
(num_outputs,) + input.shape[1:],
(n_outputs,) + input.shape[1:],
Contributor Author

Should we respect the output shape instead of flattening it, i.e., (*initial_eval.shape, *input.shape[1:])?

For example, if the forward returns shape (5, 2), the attribution should be in shape (5, 2, ...), not shape (10, ...).

cc @vivekmig @NarineK
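
A minimal sketch of the two conventions under discussion (tensor names here are illustrative only, not Captum internals):

```python
import torch

inp = torch.rand(5, 3, 4)        # input: (batch_size, A, B)
forward_out = torch.rand(5, 2)   # forward returns shape (5, 2)

# current behavior: flatten the forward output into n_outputs
flat_shape = (forward_out.numel(),) + inp.shape[1:]    # (10, 3, 4)

# suggested behavior: keep the forward output's shape as the leading dims
kept_shape = (*forward_out.shape, *inp.shape[1:])      # (5, 2, 3, 4)
```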

Contributor

I think this functionality is nice if used for multiple per-example outputs, but it might cause some confusion since the output attribution's shape may no longer always be aligned with the input shape. In particular, some models have outputs with dimensions of size 1; for example, with input 6 x 4 and output 6 x 1 (or even 6 x 1 x 1), the attribution is currently 6 x 4, but would become 6 x 1 x 4 or 6 x 1 x 1 x 4 with this change. Maybe we can squeeze all dimensions of size 1 and then utilize the output shape?
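
A hypothetical illustration of the shapes involved, assuming the output-shaped attribution proposed above (variable names are for this example only):

```python
import torch

inp = torch.rand(6, 4)   # input: 6 x 4
out = torch.rand(6, 1)   # model output: 6 x 1

# keep the output shape as-is: attribution would be 6 x 1 x 4
shape_keep = (*out.shape, *inp.shape[1:])

# squeeze size-1 dims first, then use the output shape: attribution stays 6 x 4
shape_squeezed = (*out.squeeze().shape, *inp.shape[1:])
```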

Contributor Author

@aobo-y Oct 18, 2022

> the output attributions shape may no longer always be aligned with the input shape

Do you mean the number of dimensions would no longer be the same between the input and the returned attribution? E.g., if the input is (batch_size, A, B) and the forward outputs (C, D), the attr will be (C*D, A, B). The attr still has 3 dimensions like the input, but the shape may not be the same (since C*D != batch_size).

My suggestion tries to respect the output of the forward more: the dimensions (or the entire shape) of the returned attr depend on both the input and output shapes of the forward. E.g., if the input is (batch_size, A, B) and the forward outputs (C, D), the attribution will be (C, D, A, B). I feel that if the user makes the forward return (C, D), most likely the user will also need to reshape our returned C*D back into (C, D) to further aggregate or select the attribution, e.g., when C is batch_size and D is n_tasks.

I can see the reasons behind both designs, but I would suggest not squeezing if we go with the 2nd way, because we do not understand the exact meaning of the dims of size 1. I am afraid of edge cases where keeping the 1 can be very convenient (e.g., users want to reuse selection indices created for the forward output). I would trust the users to know their own model output 😆. If they find the dims of size 1 really annoying, they can always control it by wrapping their own model: wrapped_forward = lambda *args, **kwargs: forward(*args, **kwargs).squeeze()
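
For illustration, a rough sketch of the reshape a user would likely do under the flattened convention (assumed shapes, not Captum API):

```python
import torch

batch_size, n_tasks, A, B = 5, 2, 3, 4

# flattened attribution as currently returned: (C*D, A, B) = (10, 3, 4)
flat_attr = torch.rand(batch_size * n_tasks, A, B)

# restore the forward output's shape to aggregate or select per task: (5, 2, 3, 4)
attr = flat_attr.reshape(batch_size, n_tasks, A, B)

# e.g., average attribution per task over the batch: (2, 3, 4)
per_task_attr = attr.mean(dim=0)
```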


# flattent eval outputs into 1D (n_outputs)
# add the leading dim for n_feature_perturbed
if isinstance(initial_eval, Tensor):
Contributor Author

Is it common for the forward not to return a Tensor? What else do we expect? I would guess float or double, but those are very easy to convert into a tensor, and users can do it themselves by wrapping the forward. So we could explicitly define the return type of forward.

Even if we really want to support non-Tensor returns, can we convert them in _run_forward to standardize the logic?

Contributor

Yes, I think this is fairly common; the model could return float / double / int, and there are use cases depending on this. It would be best to maintain support for this use case, although moving the logic that wraps the value as a Tensor into _run_forward should be fine. That would require an extra single-element tensor construction but would avoid branching here, so I'm fine either way.

Contributor

@vivekmig left a comment

Looks great, this looks cleaner, thanks for updating this!


# number of elements in the output of forward_func
n_outputs = initial_eval.numel() if isinstance(initial_eval, Tensor) else 1

# flattent eval outputs into 1D (n_outputs)
Contributor

nit: flatten

@facebook-github-bot
Contributor

@aobo-y has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Nov 15, 2022
Summary:
FeatureAblation (and FeaturePermutation) never clearly defined the type of `forward`'s return. According to the [documentation](https://captum.ai/api/feature_ablation.html), `tensor` is the only acceptable type:

> The forward function can either return **a scalar per example** or **a tensor of a fixed sized tensor (or scalar value)** for the full batch

However, returning a single `int` or `float` is a common use case that we already support (ref #1047 (comment)), but our code did not explicitly raise an error for unexpected types. Other types like `list`, `tuple`, or `numpy.array` may pass unexpectedly or fail in unexpected places with confusing error messages, e.g., we may use a `list` as a `torch.dtype` https://github.com/pytorch/captum/blob/5f878af6a74a5d041dc478addabfaf365d7810d5/captum/attr/_core/feature_ablation.py#L320

The PR explicitly asserts the return type and converts everything into a `tensor`. The assertion & conversion are done in a new private `_run_forward` wrapper instead of the `_run_forward` from global utils, which is shared by many other classes. I will update the others progressively and eventually push the logic into the shared `_run_forward`.

Pull Request resolved: #1049

Reviewed By: diegoolano, vivekmig

Differential Revision: D40577410

Pulled By: aobo-y

fbshipit-source-id: 796b8510c1998ae339d4cb32b0e62b930fd38be1
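
A minimal sketch of the assert-and-convert idea described in this commit; the wrapper name, signature, and exact checks below are assumptions for illustration, not the actual Captum code:

```python
import torch
from torch import Tensor

def _run_forward_checked(forward_func, *args) -> Tensor:
    """Call forward_func and standardize its return value into a Tensor."""
    output = forward_func(*args)

    # only Tensor and Python scalars (int / float) are accepted return types
    assert isinstance(output, (Tensor, int, float)), (
        f"forward_func must return a Tensor or a scalar, got {type(output)}"
    )

    # wrap scalar returns into a single-element tensor so downstream logic
    # (dtype, numel, reshape) never needs to branch on the return type
    if not isinstance(output, Tensor):
        output = torch.tensor(output, dtype=torch.float)

    return output
```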
facebook-github-bot pushed a commit that referenced this pull request Dec 20, 2022
Summary:
Correctly validate the `forward_func`'s output shape in `FeatureAblation` (and `FeaturePermutation`).

Abandoned the previous flawed assumption of "aggregation mode", which forbade support for multi-output models (ref #1047).

The new logic does not care about the output shape when `perturbations_per_eval == 1`. Only when `perturbations_per_eval > 1` does it require "non-aggregation mode", which is defined as: the 1st dim of the model's output must grow with the input's batch size at the same ratio. This is verified by actually comparing the output shapes of 2 different inputs instead of making assumptions based on other user config:
- The baseline output is from the initial eval with the original inputs, which we have to run anyway.
- The expanded output is from the 1st ablated eval, whose input batch size has been expanded for more feature perturbations.

This approach does not introduce any extra forward calls.

Pull Request resolved: #1091

Reviewed By: vivekmig

Differential Revision: D42027843

Pulled By: aobo-y

fbshipit-source-id: cbcbad64bb1695e7be9c9447ebde6ec3f0cb8a90
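
A simplified sketch of the shape check described in this commit; the function name and error message are illustrative, not the actual implementation:

```python
import torch

def _check_output_scales_with_batch(initial_out, expanded_out,
                                    n_examples, expanded_n_examples):
    """Verify the output's 1st dim grows with the input batch size at the same ratio."""
    expected = initial_out.shape[0] * expanded_n_examples // n_examples
    if expanded_out.shape[0] != expected:
        raise AssertionError(
            "perturbations_per_eval > 1 requires the model output's 1st dim to "
            f"scale with the batch size; expected {expected}, got {expanded_out.shape[0]}"
        )

# example: 4 original examples expanded to 8 rows (perturbations_per_eval = 2)
_check_output_scales_with_batch(torch.rand(4, 3), torch.rand(8, 3), 4, 8)
```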