Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple outputs for agg_mode=True in Feature Ablation #425

Closed

Conversation

miguelmartin75
Copy link
Contributor

@miguelmartin75 miguelmartin75 commented Jul 9, 2020

Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the num_examples (batch_size) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the batch_size becomes larger.

This allows for an arbitrary sized tensor to be output from the forward_func for feature ablation.


Implementation Details

We assume aggregation_output_mode to be the case if: perturbations_per_eval == 1 and [ feature_mask is None or is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has batch_size outputs) and agg_output_mode=True

If agg_output_mode == True:

  • Feature ablation will output a tensor of shape 1xOxF where O is the number of output features and F is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in agg_output_mode we must ensure the number of elements is n (batch_size). If it is not, we output an error to the user. Here we could actually check if the element size is at least n, but for simplicity I am not doing this.

Tests

Added tests to check for:

agg_mode=True:

  • Incorrect feature mask (i.e. where fm.shape[0] > 1)
  • Output a Fx1 tensor where F is the number of features in the input
  • The above but for a feature mask with the first two features treated as one feature
  • Output a 2x3x5 constant tensor (not associated to outputs)
    • internally this will be interpreted as a 1x30 2D tensor

agg_mode=False:

  • Check there is exactly n outputs where n == batch_size => if not then check that we throw an exception (assertion error). This already exists in test_error_perturbations_per_eval_limit_batch_scalar

Notes

I created a new function rather than modifying _find_output_mode_and_verify; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 786acb543c9249465e132f65713693ad3d89101d
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 67ca51aa79de0dee137ac90e3057dc4127a288ad
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 10, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 1b3d41e8096acb0dbdf0f9fd173c3cf46ecbe680
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 12, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 0d08ca990a1e999339e51f0a7fa50be197d2f3b9
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

miguelmartin75 added a commit to miguelmartin75/captum that referenced this pull request Jul 16, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Reviewed By: vivekmig

Differential Revision: D22416476

fbshipit-source-id: eff7da94323e1e3c01d73ea377902df1bc6a4e76
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Reviewed By: vivekmig

Differential Revision: D22416476

fbshipit-source-id: 344bc6db17e1bb04570e68ebc20a0a3da7c09c73
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D22416476

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in eb3e758.

NarineK pushed a commit to NarineK/captum-1 that referenced this pull request Nov 19, 2020
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as:

When there is no 1:1 correspondence with the `num_examples` (`batch_size`) and the amount of outputs your model produces, i.e. the model output size does not grow in size as the `batch_size` becomes larger.

This allows for an arbitrary sized tensor to be output from the `forward_func` for feature ablation.

 ---
## Implementation Details

We assume `aggregation_output_mode` to be the case if: `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect but for feature ablation the underlying logic is the same if there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and `agg_output_mode=True`

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF` where `O` is the number of output features and `F` is the number of input features under aggregation mode. Thus, if the model outputs a tensor > 2D the user must reshape it (as we treat the output as a 2D tensor in the implementation); thus it is recommended to only output a 2D tensor (i.e. the implementation allows for >2D).

If we are not in `agg_output_mode` we must ensure the number of elements is `n` (`batch_size`). If it is not, we output an error to the user. Here we could actually check if the element size is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output a `Fx1` tensor where `F` is the number of features in the input
- The above but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs)
   - internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check there is exactly `n` outputs where `n == batch_size` => if not then check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`; as otherwise this breaks shapley value sampling. Will have to fix this in a separate PR.

Reviewed By: vivekmig

Differential Revision: D22416476

fbshipit-source-id: d9094754ec31152a0a2199403a8b709b39a92d04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants