Fix gradient requirements for layer methods #647

Closed
vivekmig wants to merge 5 commits


@vivekmig vivekmig commented Apr 5, 2021

This sets gradient requirements on the layer inputs / outputs rather than on the original inputs, which ensures that they are applied even when the inputs are non-floating-point (e.g. token indices). It also avoids unnecessarily requiring gradients between the input and the target layer when only layer gradients are needed.
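For readers unfamiliar with the issue, here is a minimal sketch of why the requirement has to move to the layer. It is not Captum's implementation: the tiny `embedding` / `head` model and all variable names are made up for illustration. It shows that integer token indices cannot require gradients, while the output of the target layer can, and that nothing upstream of the layer needs to be tracked when only layer gradients are wanted.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model: an embedding layer (the "target layer") and a linear head.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)
head = nn.Linear(4, 1)

token_ids = torch.tensor([[1, 2, 3]])  # int64 token indices

# Integer tensors cannot require gradients, so setting the requirement
# on the original input fails.
try:
    token_ids.requires_grad_()
    raised = False
except RuntimeError:
    raised = True

# Run the part before the target layer without tracking gradients,
# then set the requirement on the layer output instead.
with torch.no_grad():
    layer_output = embedding(token_ids)
layer_output.requires_grad_()

# Gradients of the model output with respect to the layer output only.
score = head(layer_output).sum()
(layer_grad,) = torch.autograd.grad(score, layer_output)

print(raised, tuple(layer_grad.shape))
```

Because the head is linear, each position's gradient in this toy setup is simply the head's weight vector, which makes the result easy to check by hand.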

@vivekmig vivekmig changed the title WIP: Fix gradient requirements for layer methods Fix gradient requirements for layer methods Apr 6, 2021
@facebook-github-bot

@vivekmig has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@vivekmig vivekmig requested a review from NarineK April 6, 2021 20:54

@NarineK NarineK left a comment


LGTM! Thank you for the fix. Couple nits and questions.

@@ -23,7 +23,9 @@
 )


-def apply_gradient_requirements(inputs: Tuple[Tensor, ...]) -> List[bool]:
+def apply_gradient_requirements(
+    inputs: Tuple[Tensor, ...], warn: bool = False
@NarineK:

nit: To preserve the original behavior, shouldn't we skip the warning only when we know it is unnecessary, as in the layer approaches?

@vivekmig (author):

Great catch, thanks! Yes, you're definitely right; I meant to set the default to True.
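A rough sketch of the behavior discussed in this thread, with `warn` defaulting to True so existing callers keep their warnings and layer methods can opt out. This is an assumption about how such a helper could look, not the merged code: the function name `apply_gradient_requirements_sketch` and the warning messages are hypothetical, and only the `inputs` / `warn` parameters and the return type come from the diff above.

```python
import warnings
from typing import List, Tuple

import torch
from torch import Tensor


def apply_gradient_requirements_sketch(
    inputs: Tuple[Tensor, ...], warn: bool = True
) -> List[bool]:
    """Record each tensor's original requires_grad flag, then enable
    gradients wherever the dtype allows it."""
    original_flags = []
    for index, tensor in enumerate(inputs):
        original_flags.append(tensor.requires_grad)
        # Only floating-point and complex tensors can require gradients.
        if not (tensor.dtype.is_floating_point or tensor.dtype.is_complex):
            if warn:
                warnings.warn(
                    f"Input {index} has dtype {tensor.dtype} and cannot "
                    "require gradients."
                )
            continue
        if not tensor.requires_grad:
            if warn:
                warnings.warn(
                    f"Input {index} did not require gradients; "
                    "setting requires_grad=True."
                )
            tensor.requires_grad_()
    return original_flags


# Default call warns, as before; a layer method could pass warn=False.
float_input = torch.zeros(2)
index_input = torch.tensor([1, 2])
flags = apply_gradient_requirements_sketch((float_input, index_input), warn=False)
```

Returning the original flags lets the caller restore them once attribution is done.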

captum/_utils/gradient.py
@facebook-github-bot

@vivekmig has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@vivekmig merged this pull request in d630574.
