Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatibility layer between stable datasets and prototype transforms #6663

Merged
merged 39 commits into from
Feb 10, 2023

Conversation

pmeier
Copy link
Contributor

@pmeier pmeier commented Sep 28, 2022

This is the proof of concept implementation for #6662. Let's keep the discussion there unless there is need for discussion on a technical issue of the proposal.

cc @vfdev-5 @bjuncek

Conflicts:
	torchvision/prototype/features/__init__.py
	torchvision/prototype/features/_feature.py
torchvision/prototype/features/__init__.py Outdated Show resolved Hide resolved
torchvision/prototype/features/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/features/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/features/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/features/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/features/_feature.py Outdated Show resolved Hide resolved
@pmeier
Copy link
Contributor Author

pmeier commented Jan 31, 2023

dbfac05 is a refactor of the internals. Here are the main points:

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Philip, I haven't looked at the individual wrappers in depth, but the overall design looks great (I appreciate the simplifications!). I left some comments / Qs below

torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
@pmeier pmeier marked this pull request as ready for review February 9, 2023 14:14
Copy link
Contributor Author

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commits add the following changes:

  • Following introduce heuristic for simple tensor handling of transforms v2 #7170, we no longer need special handling for any non-image parts of the samples, since all our datasets fit the heuristic.

  • Following remove categories metadata from (OneHot)Label datapoint #7171 (comment) we no longer use the datapoints.Label class. Together with the pass-through for PIL images, this means that this wrapper is a no-op for classification datasets. This also resolves all discussions regarding the categories, since they are no longer present.

  • Instead of having a wrapper that takes the dataset as well as the sample as inputs, i.e.

    def wrapper(dataset, sample):
        ...
        return wrapped_sample

    the architecture was changed to a factory pattern:

    def wrapper_factory(dataset):
        def wrapper(sample):
            ...
            return wrapped_sample
    
        return wrapper

    In addition to making the wrapper more "natural", this also enables us to raise on unsupported behavior on wrapping rather than when the first sample is drawn. That should improve UX.

  • I've added automated smoke tests to our datasets v1 tests to make sure wrapping and drawing samples doesn't raise anything.

test/test_datasets.py Show resolved Hide resolved
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot Philip, LGTM. I just have minor comments (that we discussed offline and are probably already addressed) + one minor Q about wrap_target_by_type.

I'll admit I haven't taken a super deep look to the individual wrappers. We may have to do a bit of manual testing for those later.

test/datasets_utils.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
torchvision/prototype/datapoints/_dataset_wrapper.py Outdated Show resolved Hide resolved
@pmeier pmeier linked an issue Feb 10, 2023 that may be closed by this pull request
@pmeier pmeier merged commit a9d2572 into pytorch:main Feb 10, 2023
@pmeier pmeier deleted the dataset-wrappers branch February 10, 2023 14:32
@github-actions
Copy link

Hey @pmeier!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@NicolasHug NicolasHug changed the title [PoC] compatibility layer between stable datasets and prototype transforms Compatibility layer between stable datasets and prototype transforms Feb 10, 2023
facebook-github-bot pushed a commit that referenced this pull request Mar 28, 2023
…ype transforms (#6663)

Reviewed By: vmoens

Differential Revision: D44416279

fbshipit-source-id: a3c1ba2048917c5af3005beef6cec77896ab20f8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compatibility layer between stable datasets and prototype transforms?
3 participants