update tracin influence API #1072

Closed

99warriors wants to merge 1 commit into pytorch:master from 99warriors:export-D41324297

Contributor

99warriors commented Nov 16, 2022

Summary:
This diff changes the API for implementations of `TracInCPBase`, as discussed in https://fb.quip.com/JbpnAiWluZmI. In particular, the arguments of the `influence` method that represent test data change from `inputs: Tuple, targets: Optional[Tensor]` to `inputs: Union[Tuple[Any], DataLoader]`, which is either a single batch or a dataloader yielding batches. In both cases, `model(*batch[0:-1])` is assumed to produce the predictions for a batch, and `batch[-1]` is assumed to be the labels for that batch. This is the same format assumed of the batches yielded by `train_dataloader`.
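As a rough illustration of the batch convention described above, the sketch below splits a batch into features and labels. It assumes the features are every element except the last, so the model is called as `model(*batch[:-1])`. The helper names `split_batch` and `predict_and_labels`, and the toy model, are hypothetical and not part of Captum; plain lists stand in for tensors.

```python
from typing import Any, Callable, Tuple


def split_batch(batch: Tuple[Any, ...]) -> Tuple[Tuple[Any, ...], Any]:
    """Split a batch into (features, labels); the last element is the labels."""
    if len(batch) < 2:
        raise ValueError("expected at least one feature element plus labels")
    return batch[:-1], batch[-1]


def predict_and_labels(
    model: Callable[..., Any], batch: Tuple[Any, ...]
) -> Tuple[Any, Any]:
    """Run the model on the feature elements and return (predictions, labels)."""
    features, labels = split_batch(batch)
    return model(*features), labels


# Toy stand-in for a model (an assumption for illustration only):
# it adds two feature lists elementwise.
def toy_model(x1, x2):
    return [a + b for a, b in zip(x1, x2)]


# A batch with two feature elements followed by the labels.
batch = ([1, 2, 3], [10, 20, 30], [0, 1, 0])
predictions, labels = predict_and_labels(toy_model, batch)
```

The same helper would apply unchanged to a training batch, which is the point of using one format for both.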

We make this change for two reasons:

- It unifies the assumptions made about the test data with those made about the training data.
- For some implementations, we want to allow the test data to be represented by a dataloader. With the old API, there was no clean way to accept both a single batch and a dataloader, since a batch required 2 arguments (`inputs` and `targets`) but a dataloader requires only 1.
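The second reason can be sketched as follows: once a batch is a single tuple, one argument can carry either a batch or a collection of batches, and the callee can normalize both into a stream of batches. This is a minimal sketch, not the Captum implementation. The function `iter_test_batches` is hypothetical, and a plain list stands in for a `DataLoader` so the example runs without PyTorch; real code would more likely check `isinstance(inputs, DataLoader)`.

```python
from typing import Any, Iterable, Iterator, Tuple, Union

Batch = Tuple[Any, ...]


def iter_test_batches(inputs: Union[Batch, Iterable[Batch]]) -> Iterator[Batch]:
    """Yield batches from either a single batch (a tuple) or an iterable of batches."""
    if isinstance(inputs, tuple):
        # A single batch: wrap it so callers can always loop over batches.
        yield inputs
    else:
        # A dataloader-like iterable: pass its batches through unchanged.
        yield from inputs


# One batch: a feature element followed by labels.
single = ([1, 2], [0, 1])

# A list of batches standing in for a dataloader (an assumption for illustration).
loader_like = [([1, 2], [0, 1]), ([3, 4], [1, 0])]

from_single = list(iter_test_batches(single))
from_loader = list(iter_test_batches(loader_like))
```

With the old two-argument form (`inputs`, `targets`), the dataloader case would have left `targets` unused, which is the asymmetry this change removes.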

For now, all implementations only allow `inputs` to be a tuple (and not a dataloader). This is okay due to inheritance rules. Later on, we will allow some implementations (i.e. `TracInCP`) to accept a dataloader as `inputs`.

Other changes:

- Documentation changes. For example, the documentation in `TracInCPBase.influence` now refers to the "test dataset" instead of the test batch.
- The `unpack_inputs` argument is no longer needed for the `influence` methods and is removed.
- The usage of `influence` in all the tests is changed to match the new API.
- The signatures of the helper methods `_influence_batch_tracincp` and `_influence_batch_tracincp_fast` are changed to match the new representation of batches.

Reviewed By: cyrjano

Differential Revision: D41324297

facebook-github-bot added cla signed fb-exported labels

Contributor

facebook-github-bot commented Nov 16, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

49f2399


fbshipit-source-id: 7ac12211941172d86a0aadbb6b5dd41ae9e1b52b

99warriors force-pushed the export-D41324297 branch from 9b14f36 to 49f2399 Compare

November 21, 2022 04:03

Contributor

facebook-github-bot commented Nov 21, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

3b1aa94


fbshipit-source-id: c5834f74e301b4ccbbc2cc0b9f331455ff04a4b2

99warriors force-pushed the export-D41324297 branch from 49f2399 to 3b1aa94 Compare

December 3, 2022 23:33

Contributor

facebook-github-bot commented Dec 3, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

0d9ebb4


fbshipit-source-id: 9fe108de1a6789d461c19d71b724cd18bbcffbd9

99warriors force-pushed the export-D41324297 branch from 3b1aa94 to 0d9ebb4 Compare

December 8, 2022 20:12

Contributor

facebook-github-bot commented Dec 8, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors added a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

d5a054c


fbshipit-source-id: 78a982fbc07a5555c9eae1ef0b4177088cd217fd

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

b452a17


fbshipit-source-id: 7fb77eb9014adf846ed973a4f8edf82c44127595

99warriors force-pushed the export-D41324297 branch from 0d9ebb4 to b452a17 Compare

December 8, 2022 20:29

Contributor

facebook-github-bot commented Dec 8, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

facd51c


fbshipit-source-id: 9204b4c5b75f7bff1093ddf562a1cba4dfb83284

99warriors force-pushed the export-D41324297 branch from b452a17 to facd51c Compare

December 9, 2022 16:16

Contributor

facebook-github-bot commented Dec 9, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors added a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

5239de3


fbshipit-source-id: d962b940452685e7e488986f11c769633b3d3e2d

Contributor

facebook-github-bot commented Dec 19, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

99warriors pushed a commit to 99warriors/captum that referenced this pull request


          update tracin influence API (pytorch#1072)

24cd933


fbshipit-source-id: f0098b83a486b49059c02f359f093ed3b791688c

99warriors force-pushed the export-D41324297 branch from facd51c to 24cd933 Compare

December 19, 2022 17:30


          update tracin influence API (pytorch#1072)

3fde650


fbshipit-source-id: 827350795bf2e5c6c1fab2e5ef8f2db1473dfe3d

99warriors force-pushed the export-D41324297 branch from 24cd933 to 3fde650 Compare

December 19, 2022 20:04

Contributor

facebook-github-bot commented Dec 19, 2022

This pull request was exported from Phabricator. Differential Revision: D41324297

facebook-github-bot closed this in

fe13596

facebook-github-bot added the Merged label

Contributor

facebook-github-bot commented Dec 19, 2022

This pull request has been merged in fe13596.
