
Internal Batching Improvements #333

Closed · wants to merge 12 commits

Conversation

vivekmig (Contributor)

This PR improves internal batching by avoiding the initial creation of an expanded tensor; instead, attributions are computed recursively for each subset of steps and the results are summed.

This adds the extra condition that internal_batch_size must be at least equal to the number of examples; the documentation has been updated accordingly.
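
A minimal sketch of the idea (the callable `attribute_fn` and its `(start, end)` step-range arguments are illustrative assumptions, not Captum's exact internals):

```python
def batched_attribution_sketch(attribute_fn, inputs, n_steps, internal_batch_size):
    # Never materialize the full (n_steps * num_examples) expanded tensor:
    # attribute one chunk of integration steps at a time and sum the results.
    num_examples = inputs.shape[0]
    step_count = internal_batch_size // num_examples
    assert step_count > 0, (
        "Internal batch size must be at least equal to the number of input examples."
    )

    def recurse(start):
        end = min(start + step_count, n_steps)
        # attribute_fn is a hypothetical callable that computes attributions
        # for integration steps [start, end) only.
        partial = attribute_fn(inputs, start, end)
        return partial if end >= n_steps else partial + recurse(end)

    return recurse(0)
```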

@vivekmig vivekmig requested a review from NarineK March 24, 2020 22:49
@@ -53,6 +53,7 @@ def __init__(self, forward_func: Callable) -> None:
             modification of it
         """
         GradientAttribution.__init__(self, forward_func)
+        self.predefined_step_size_alphas = None  # used for internal batching
NarineK (Contributor) commented on Mar 25, 2020

I think this can be dangerous if we call attribute multiple times for different inputs with different n_steps. Does it have to be an instance variable?

vivekmig (Author)

This attribute is only modified in _batched_attribution for recursive sub-calls and is always reset to None after each operation, so it shouldn't affect separate attribution calls. The alternative is to add another argument to attribute that is used only for this, but that could confuse users looking at the method signature, so I preferred this approach.

NarineK (Contributor) commented on Mar 30, 2020

I think that if this gets executed in a multi-threaded environment there can be some issues...

One way to avoid this could be to move the core logic of IG into an auxiliary function such as _attribute, which additionally takes start and end kwargs. _batched_attribution could then call _attribute instead of attribute?
https://github.com/pytorch/captum/pull/333/files#diff-67ff19e8dcc2510648379db71c2ee9fbR275
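
A rough sketch of that shape (signatures are illustrative, not Captum's actual API); the thread-safety comes from passing all per-call state through arguments rather than storing it on the instance:

```python
class IntegratedGradientsSketch:
    def attribute(self, inputs, n_steps=50, internal_batch_size=None):
        if internal_batch_size is not None:
            num_examples = inputs.shape[0]
            step_count = max(1, internal_batch_size // num_examples)
            total = None
            # Each chunk of steps goes through _attribute with an explicit
            # start/end range, so no per-call state lives on self and
            # concurrent attribute() calls cannot interfere.
            for start in range(0, n_steps, step_count):
                end = min(start + step_count, n_steps)
                partial = self._attribute(inputs, n_steps, start, end)
                total = partial if total is None else total + partial
            return total
        return self._attribute(inputs, n_steps, 0, n_steps)

    def _attribute(self, inputs, n_steps, start, end):
        # Core IG logic, restricted to integration steps [start, end);
        # body elided in this sketch.
        raise NotImplementedError
```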

vivekmig (Author)

Sounds good, I can switch to that; it would avoid adding a user-facing parameter, which is a plus. Thanks for the suggestion!

I think calling attribute from different threads on the same attribution object isn't broadly supported by our methods anyway, since we have other cases where temporary info such as hook lists is stored as an object attribute, which would have the same issue. But I agree that approach is cleaner; will update to it, thanks!

NarineK (Contributor)

Thank you :) Yes, that's a good point. Ideally we should make those hook holders local variables as well. IG is in more sophisticated shape and more likely to be used than the other methods; we can keep this in mind and make the necessary adjustments in the other approaches as well.

@@ -43,7 +43,7 @@ commands:
   steps:
     - run:
         name: "Check import order with isort"
-        command: isort --check-only
+        command: isort --check-only -v
vivekmig (Author)

Switching this to verbose to get more details on incorrect import orderings in CircleCI.

NarineK (Contributor) left a comment

Looks good! Thank you! Left some comments.
Instead of asserts for internal_batch_size, it might be better to default it and show a warning message.

captum/attr/_utils/batching.py
step_count = internal_batch_size // num_examples
assert (
step_count > 0
), "Internal batch size must be at least equal to the number of input examples."
NarineK (Contributor)

Wouldn't it be better to do step_count = max(1, internal_batch_size // num_examples) and show a warning to the user when internal_batch_size is too small? That might be a better user experience.
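
Something along these lines (a sketch of the suggestion, not the final code; `resolve_step_count` is a hypothetical name):

```python
import warnings

def resolve_step_count(internal_batch_size, num_examples):
    # Variant of the quoted snippet: default instead of asserting.
    step_count = internal_batch_size // num_examples
    if step_count < 1:
        warnings.warn(
            "internal_batch_size is smaller than the number of input "
            "examples; defaulting to one step per internal batch."
        )
        step_count = 1
    return step_count
```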

if include_endpoint:
    assert (
        step_count > 1
    ), "Internal batch size must be at least twice the number of input examples."
NarineK (Contributor)

Perhaps explaining why here would be better. Also, defaulting as in the previous suggestion might be preferable.

NarineK (Contributor)

Does this mean that if n_steps=1, this will always fail, because internal_batch_size can only be 1 in that case?
Perhaps adding a recommendation to the message would be helpful.

vivekmig (Author)

Yes, that will fail (and probably should), since it would also have failed in the implementation (LayerConductance) where this argument is used: we take the difference between consecutive steps to estimate the gradient, and that isn't possible with a single step.
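
To make the constraint concrete, a small illustration (not Captum's actual code): conductance-style methods difference outputs at consecutive interpolation steps, so a batch carrying only one step per example leaves nothing to difference.

```python
import torch

outputs = torch.randn(5, 3)              # outputs at 5 consecutive interpolation steps
step_diffs = outputs[1:] - outputs[:-1]  # 4 consecutive-step differences
# With a single step per batch, outputs[1:] - outputs[:-1] would be empty,
# which is why include_endpoint requires step_count > 1.
assert step_diffs.shape[0] == outputs.shape[0] - 1
```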

@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-from typing import List, NamedTuple, Optional, Tuple, Dict
+from typing import Dict, List, NamedTuple, Optional, Tuple
vivekmig (Author)

Fixing isort failure on master

facebook-github-bot left a comment

@vivekmig has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@vivekmig merged this pull request in 455483a.

NarineK pushed a commit to NarineK/captum-1 that referenced this pull request Nov 19, 2020
Summary:
This PR improves internal batching by avoiding the initial creation of an expanded tensor; instead, attributions are computed recursively for each subset of steps and the results are summed.

This adds the extra condition that internal_batch_size must be at least equal to the number of examples; the documentation has been updated accordingly.
Pull Request resolved: pytorch#333

Reviewed By: NarineK

Differential Revision: D20796123

Pulled By: vivekmig

fbshipit-source-id: 78931a86b0e092ec0b257793aa6e20aadc081947