Allow multiple outputs for agg_mode=True in Feature Ablation #425

Closed

miguelmartin75 wants to merge 1 commit into pytorch:master from miguelmartin75:export-D22416476

Contributor

miguelmartin75 commented Jul 9, 2020

Description

What is aggregation output mode? It can be defined as:

The case where there is no 1:1 correspondence between num_examples (batch_size) and the number of outputs your model produces, i.e. the model's output size does not grow as batch_size becomes larger.

This allows forward_func to output an arbitrarily sized tensor for feature ablation.

Implementation Details

We assume aggregation_output_mode if perturbations_per_eval == 1 and feature_mask is either None or of length 1 (i.e. it is associated with all inputs). This heuristic is not perfect, but for feature ablation the underlying logic is the same when there is a 1:1 correspondence (i.e. the model has batch_size outputs) and agg_output_mode=True.
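The detection rule above can be sketched as a small predicate. This is only an illustration of the stated condition, not the code in this PR: the function name `assume_agg_output_mode` is hypothetical, and the feature mask is modelled as a plain nested sequence whose outer length plays the role of `fm.shape[0]`.

```python
from typing import Optional, Sequence


def assume_agg_output_mode(
    perturbations_per_eval: int,
    feature_mask: Optional[Sequence[Sequence[int]]],
) -> bool:
    """Hypothetical helper mirroring the rule described in the PR text.

    Aggregation output mode is assumed only when a single perturbation is
    evaluated per forward call AND the feature mask is either absent or has
    a leading dimension of 1 (i.e. one mask shared by all inputs).
    """
    mask_shared_by_all_inputs = feature_mask is None or len(feature_mask) == 1
    return perturbations_per_eval == 1 and mask_shared_by_all_inputs


if __name__ == "__main__":
    print(assume_agg_output_mode(1, None))              # no mask
    print(assume_agg_output_mode(1, [[0, 0, 1]]))       # one mask row
    print(assume_agg_output_mode(1, [[0, 1], [0, 1]]))  # per-example mask
    print(assume_agg_output_mode(2, None))              # batched perturbations
```

Both conditions must hold for the rule to apply, so a per-example mask or more than one perturbation per evaluation falls back to the regular (non-aggregation) path.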

If agg_output_mode == True:

Under aggregation mode, feature ablation will output a tensor of shape 1xOxF, where O is the number of output features and F is the number of input features. The implementation treats the model output as a 2D tensor, so if the model outputs a tensor > 2D the user must reshape it; it is therefore recommended to only output a 2D tensor, although the implementation allows for >2D.

If we are not in agg_output_mode, we must ensure the number of output elements is n (batch_size). If it is not, we raise an error. We could instead check that the number of elements is at least n, but for simplicity I am not doing this.
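As a rough sketch of the two cases, the shape bookkeeping might look like the following. The helper names `agg_attribution_shape` and `verify_non_agg_output` are hypothetical, and shapes are plain tuples rather than tensors; taking O as the total element count of the model output is an assumption based on the `2x3x5` to `1x30` example in the tests.

```python
import math
from typing import Sequence, Tuple


def agg_attribution_shape(
    output_shape: Sequence[int], num_input_features: int
) -> Tuple[int, int, int]:
    """Attribution shape 1 x O x F in aggregation mode (hypothetical helper).

    Assumption: the model output is flattened, so O is its element count.
    """
    num_outputs = math.prod(output_shape)
    return (1, num_outputs, num_input_features)


def verify_non_agg_output(output_shape: Sequence[int], batch_size: int) -> None:
    """Outside aggregation mode, require exactly batch_size output elements."""
    num_elements = math.prod(output_shape)
    assert num_elements == batch_size, (
        f"expected exactly {batch_size} output elements (one per example), "
        f"got {num_elements}"
    )


if __name__ == "__main__":
    print(agg_attribution_shape((2, 3, 5), num_input_features=4))
    verify_non_agg_output((8,), batch_size=8)  # passes silently
```

The check is an exact equality rather than "at least n", matching the simplification described above.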

Tests

Added tests to check for:

agg_mode=True:

Incorrect feature mask (i.e. where fm.shape[0] > 1)
Output an Fx1 tensor where F is the number of features in the input
The above but for a feature mask with the first two features treated as one feature
Output a 2x3x5 constant tensor (not associated to outputs)
- internally this will be interpreted as a 1x30 2D tensor
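To make the first few cases concrete, here is a toy, dependency-free sketch of ablation with an aggregate output. It is not Captum's implementation: `ablate_agg` and `column_sums` are hypothetical names, inputs are nested lists instead of tensors, the feature mask is simplified to a flat list of group ids, and the baseline is assumed to be 0.

```python
from typing import Callable, List, Optional, Sequence

Batch = List[List[float]]  # batch_size rows, F features per row


def column_sums(batch: Batch) -> List[float]:
    """Toy forward function: F outputs no matter how large the batch is."""
    return [sum(col) for col in zip(*batch)]


def ablate_agg(
    forward_func: Callable[[Batch], List[float]],
    inputs: Batch,
    feature_mask: Optional[Sequence[int]] = None,
    baseline: float = 0.0,
) -> List[List[float]]:
    """Return an O x F table: output change when each feature group is ablated."""
    num_features = len(inputs[0])
    # Without a mask, every input feature is its own group.
    mask = list(feature_mask) if feature_mask is not None else list(range(num_features))
    original = forward_func(inputs)
    attributions = [[0.0] * num_features for _ in original]
    for group in sorted(set(mask)):
        # Replace every feature in this group with the baseline, for all rows.
        ablated = [
            [baseline if mask[f] == group else x for f, x in enumerate(row)]
            for row in inputs
        ]
        perturbed = forward_func(ablated)
        for o, (before, after) in enumerate(zip(original, perturbed)):
            for f in range(num_features):
                if mask[f] == group:
                    attributions[o][f] = before - after
    return attributions


if __name__ == "__main__":
    batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(ablate_agg(column_sums, batch))
    # First two features treated as one group.
    print(ablate_agg(column_sums, batch, feature_mask=[0, 0, 1]))
```

When two features share a group id they are ablated together, so they receive the same attribution for every output.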

agg_mode=False:

Check that there are exactly n outputs, where n == batch_size; if not, check that we throw an exception (assertion error). This already exists in test_error_perturbations_per_eval_limit_batch_scalar

Notes

I created a new function rather than modifying _find_output_mode_and_verify, as modifying it would break Shapley value sampling. That will have to be fixed in a separate PR.

Differential Revision: D22416476

facebook-github-bot added the fb-exported label

Contributor

facebook-github-bot commented Jul 9, 2020

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

7a9fef6

…#425)


miguelmartin75 force-pushed the export-D22416476 branch from c32c797 to 7a9fef6 Compare

July 10, 2020 20:48


miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

7c983fa

…#425)


miguelmartin75 force-pushed the export-D22416476 branch from 7a9fef6 to 7c983fa Compare

July 10, 2020 21:27


miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

7fdbbac

…#425)


miguelmartin75 force-pushed the export-D22416476 branch from 7c983fa to 7fdbbac Compare

July 10, 2020 21:42


miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

070efc8

…#425)


miguelmartin75 force-pushed the export-D22416476 branch from 7fdbbac to 070efc8 Compare

July 12, 2020 01:54


miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

eae51dc

…#425)

Reviewed By: vivekmig

Differential Revision: D22416476

fbshipit-source-id: eff7da94323e1e3c01d73ea377902df1bc6a4e76

miguelmartin75 force-pushed the export-D22416476 branch from 070efc8 to eae51dc Compare

July 16, 2020 18:42



          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

1cfc5e3

…#425)


miguelmartin75 force-pushed the export-D22416476 branch from eae51dc to 1cfc5e3 Compare

July 21, 2020 21:13


facebook-github-bot closed this in

eb3e758

Contributor

facebook-github-bot commented Jul 22, 2020

This pull request has been merged in eb3e758.

facebook-github-bot added the Merged label

NarineK pushed a commit to NarineK/captum-1 that referenced this pull request


          Allow multiple outputs for agg_mode=True in Feature Ablation (pytorch…

fe9d063

…#425)

