enforce pickleability for v2 transforms and wrapped datasets #7860

pmeier · 2023-08-21T12:19:31Z

Fixes #6753 (comment).

Fortunately, all our v2 transforms were already pickleable. Thus, this PR just adds tests to enforce this in the future.
Due to the dynamic type of the wrapped dataset (see #7239, "add support for instance checks on dataset wrappers"), we need to "help" pickle a little so that it is able to deserialize the object. We only need to override the __reduce__ method. Again, we also add tests to enforce pickleability in the future.
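To illustrate why a dynamically created type needs help from `__reduce__`, here is a minimal sketch using only the standard library. `FakeDataset`, `WrappedDataset`, and `wrap_dataset` are hypothetical names chosen for this example; this is not the code in torchvision/datapoints/_dataset_wrapper.py.

```python
import pickle


class FakeDataset:
    """Hypothetical stand-in for a dataset class."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        return index


class WrappedDataset:
    """Base of the wrapper; the concrete type is created on the fly."""

    def __init__(self, dataset):
        self._dataset = dataset

    def __len__(self):
        return len(self._dataset)

    def __getitem__(self, index):
        return ("wrapped", self._dataset[index])

    def __reduce__(self):
        # The dynamic type cannot be found by name when unpickling, so we
        # tell pickle to call the factory again with the inner dataset.
        return wrap_dataset, (self._dataset,)


def wrap_dataset(dataset):
    # Subclass both the wrapper and the dataset's own class so that
    # isinstance(wrapped, type(dataset)) holds.
    name = f"Wrapped{type(dataset).__name__}"
    wrapped_cls = type(name, (WrappedDataset, type(dataset)), {})
    return wrapped_cls(dataset)


wrapped = wrap_dataset(FakeDataset(3))
restored = pickle.loads(pickle.dumps(wrapped))
```

Without the `__reduce__` override, `pickle.dumps(wrapped)` would fail in this sketch because the generated class is not importable by name.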

cc @vfdev-5

pytorch-bot · 2023-08-21T12:19:34Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7860

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 29 New Failures, 1 Unrelated Failure

As of commit d256c3f with merge base 92882b6:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

windows (windows.g5.4xlarge.nvidia.gpu, cuda, 11.8) / windows-job (gh)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

NicolasHug

Thanks Philip, some comments below but LGTM if green.

Perhaps we'll want to add more tests w.r.t. multiprocessing_context="spawn".
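With a spawn context, objects handed to worker processes have to be pickled. As a rough approximation of that requirement (not a replacement for a real DataLoader test with multiprocessing_context="spawn"), the sketch below round-trips an object through the pickler that the standard library's multiprocessing module uses. `survives_spawn_pickling` is a hypothetical helper name.

```python
import functools
import pickle
from multiprocessing.reduction import ForkingPickler


def survives_spawn_pickling(obj):
    """Return True if obj round-trips through multiprocessing's pickler."""
    try:
        pickle.loads(ForkingPickler.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A partial over a builtin is pickled by reference and survives.
to_binary = functools.partial(int, base=2)

# A lambda has no importable name and cannot be pickled.
to_binary_lambda = lambda s: int(s, 2)

ok = survives_spawn_pickling(to_binary)
bad = survives_spawn_pickling(to_binary_lambda)
```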

pmeier · 2023-08-22T08:40:04Z

With c68a6de

$ pytest --durations=25 test/test_datasets.py -k v2
[...]
========================================== slowest 25 durations ==========================================
16.05s call     test/test_datasets.py::VOCDetectionTestCase::test_transforms_v2_wrapper
8.05s call     test/test_datasets.py::CityScapesTestCase::test_transforms_v2_wrapper
8.02s call     test/test_datasets.py::CelebATestCase::test_transforms_v2_wrapper
5.38s call     test/test_datasets.py::CocoDetectionTestCase::test_transforms_v2_wrapper
5.23s call     test/test_datasets.py::KittiTestCase::test_transforms_v2_wrapper
3.01s call     test/test_datasets.py::SBDatasetTestCase::test_transforms_v2_wrapper
2.86s call     test/test_datasets.py::HMDB51TestCase::test_transforms_v2_wrapper
2.84s call     test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
2.82s call     test/test_datasets.py::KineticsTestCase::test_transforms_v2_wrapper
2.82s call     test/test_datasets.py::UCF101TestCase::test_transforms_v2_wrapper
2.74s call     test/test_datasets.py::CIFAR10TestCase::test_transforms_v2_wrapper
2.73s call     test/test_datasets.py::CIFAR100::test_transforms_v2_wrapper
2.70s call     test/test_datasets.py::ImageNetTestCase::test_transforms_v2_wrapper
2.70s call     test/test_datasets.py::VOCSegmentationTestCase::test_transforms_v2_wrapper
2.69s call     test/test_datasets.py::KMNISTTestCase::test_transforms_v2_wrapper
2.68s call     test/test_datasets.py::ImageFolderTestCase::test_transforms_v2_wrapper
2.67s call     test/test_datasets.py::MNISTTestCase::test_transforms_v2_wrapper
2.66s call     test/test_datasets.py::DatasetFolderTestCase::test_transforms_v2_wrapper
2.66s call     test/test_datasets.py::FashionMNISTTestCase::test_transforms_v2_wrapper
2.66s call     test/test_datasets.py::EMNISTTestCase::test_transforms_v2_wrapper
2.64s call     test/test_datasets.py::OxfordIIITPetTestCase::test_transforms_v2_wrapper
2.62s call     test/test_datasets.py::QMNISTTestCase::test_transforms_v2_wrapper
0.07s call     test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper
0.07s call     test/test_datasets.py::Caltech256TestCase::test_transforms_v2_wrapper
0.03s setup    test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
======================================== short test summary info =========================================
FAILED test/test_datasets.py::Caltech256TestCase::test_transforms_v2_wrapper - FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
FAILED test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper - ValueError: bad value(s) in fds_to_keep
================== 2 failed, 22 passed, 1 skipped, 479 deselected in 101.67s (0:01:41) ===================

On main and without the DataLoader test

$ pytest test/test_datasets.py -k v2
[...]
====================== 68 passed, 191 skipped, 479 deselected, 5 warnings in 8.36s =======================

Meaning, we are adding roughly 1.5 minutes of test time.

pmeier · 2023-08-22T09:00:51Z

After 5358620

$ pytest --durations=25 test/test_datasets.py -k v2
[...]
========================================== slowest 25 durations ==========================================
4.61s call     test/test_datasets.py::CityScapesTestCase::test_transforms_v2_wrapper
4.57s call     test/test_datasets.py::CelebATestCase::test_transforms_v2_wrapper
4.48s call     test/test_datasets.py::VOCDetectionTestCase::test_transforms_v2_wrapper
1.63s call     test/test_datasets.py::HMDB51TestCase::test_transforms_v2_wrapper
1.58s call     test/test_datasets.py::UCF101TestCase::test_transforms_v2_wrapper
1.57s call     test/test_datasets.py::SBDatasetTestCase::test_transforms_v2_wrapper
1.56s call     test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
1.51s call     test/test_datasets.py::KineticsTestCase::test_transforms_v2_wrapper
1.51s call     test/test_datasets.py::QMNISTTestCase::test_transforms_v2_wrapper
1.49s call     test/test_datasets.py::VOCSegmentationTestCase::test_transforms_v2_wrapper
1.48s call     test/test_datasets.py::CocoDetectionTestCase::test_transforms_v2_wrapper
1.48s call     test/test_datasets.py::ImageNetTestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::ImageFolderTestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::DatasetFolderTestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::CIFAR10TestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::OxfordIIITPetTestCase::test_transforms_v2_wrapper
1.46s call     test/test_datasets.py::KMNISTTestCase::test_transforms_v2_wrapper
1.45s call     test/test_datasets.py::MNISTTestCase::test_transforms_v2_wrapper
1.43s call     test/test_datasets.py::FashionMNISTTestCase::test_transforms_v2_wrapper
1.43s call     test/test_datasets.py::EMNISTTestCase::test_transforms_v2_wrapper
1.43s call     test/test_datasets.py::CIFAR100::test_transforms_v2_wrapper
1.42s call     test/test_datasets.py::KittiTestCase::test_transforms_v2_wrapper
0.02s setup    test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
0.01s call     test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper

(1 durations < 0.005s hidden.  Use -vv to show these durations.)
======================================== short test summary info =========================================
FAILED test/test_datasets.py::Caltech256TestCase::test_transforms_v2_wrapper - FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp731zruyt/caltech256/256_ObjectCatego...
FAILED test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper - ValueError: bad value(s) in fds_to_keep
================== 2 failed, 22 passed, 1 skipped, 479 deselected, 3 warnings in 44.24s ==================

Meaning, we are down to roughly 30 seconds of extra time.

pmeier · 2023-08-22T09:19:23Z

After f339e6c while pretending I'm on macOS

$ pytest test/test_datasets.py -k v2
[...]
========================================== slowest 25 durations ==========================================
4.43s call     test/test_datasets.py::CityScapesTestCase::test_transforms_v2_wrapper
4.42s call     test/test_datasets.py::CelebATestCase::test_transforms_v2_wrapper
4.39s call     test/test_datasets.py::VOCDetectionTestCase::test_transforms_v2_wrapper
1.71s call     test/test_datasets.py::SBDatasetTestCase::test_transforms_v2_wrapper
1.57s call     test/test_datasets.py::KineticsTestCase::test_transforms_v2_wrapper
1.49s call     test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
1.48s call     test/test_datasets.py::VOCSegmentationTestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::CocoDetectionTestCase::test_transforms_v2_wrapper
1.47s call     test/test_datasets.py::KittiTestCase::test_transforms_v2_wrapper
1.43s call     test/test_datasets.py::OxfordIIITPetTestCase::test_transforms_v2_wrapper
1.42s call     test/test_datasets.py::ImageNetTestCase::test_transforms_v2_wrapper
0.03s setup    test/test_datasets.py::Caltech101TestCase::test_transforms_v2_wrapper
0.02s call     test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper

(12 durations < 0.005s hidden.  Use -vv to show these durations.)
======================================== short test summary info =========================================
FAILED test/test_datasets.py::WIDERFaceTestCase::test_transforms_v2_wrapper - ValueError: bad value(s) in fds_to_keep
================== 1 failed, 11 passed, 1 skipped, 479 deselected, 1 warning in 27.57s ===================

We are down to ~20 seconds of extra runtime compared to main. Note that we still have a failure that will add a few seconds as well when fixed. Plus, the benchmark was done on my machine, which is more powerful than our macOS CI runners. So the extra time will likely be higher in CI.
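For reference, the extra runtime at each step can be recomputed from the pytest totals quoted above. This is plain arithmetic on the reported wall-clock times; the figures given in the comments are rounded.

```python
# Total runtime on main without the DataLoader test, in seconds.
baseline = 8.36

# Total runtime reported after each commit, in seconds.
runs = {"c68a6de": 101.67, "5358620": 44.24, "f339e6c": 27.57}

# Extra time relative to main for each commit.
overhead = {commit: round(total - baseline, 2) for commit, total in runs.items()}

for commit, extra in overhead.items():
    print(f"{commit}: +{extra:.2f}s")
```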

pmeier · 2023-08-22T12:31:07Z

test/datasets_utils.py

@@ -548,7 +549,7 @@ def test_feature_types(self, config):
    @test_all_configs
    def test_num_examples(self, config):
        with self.create_dataset(config) as (dataset, info):
-            assert len(dataset) == info["num_examples"]
+            assert len(list(dataset)) == len(dataset) == info["num_examples"]


We never actually consumed the dataset before. Thus, any failure that occurred after the first sample went undetected. Fortunately, only one test was broken, which I'll flag below.
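A small sketch of the difference between the two assertions, using a hypothetical `BrokenDataset` whose last sample cannot be loaded:

```python
class BrokenDataset:
    """Hypothetical dataset that reports 3 samples but cannot load the last."""

    def __len__(self):
        return 3

    def __getitem__(self, index):
        if index >= len(self):
            raise IndexError(index)
        if index == 2:
            raise FileNotFoundError(f"sample {index} is missing")
        return index


dataset = BrokenDataset()

# The old check passes because it never loads a sample.
old_check_passes = len(dataset) == 3

# The new check iterates over every sample and hits the broken one.
try:
    new_check_passes = len(list(dataset)) == len(dataset) == 3
except FileNotFoundError:
    new_check_passes = False
```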

pmeier · 2023-08-22T12:35:36Z

test/test_datasets.py


 class Caltech256TestCase(datasets_utils.ImageDatasetTestCase):
    DATASET_CLASS = datasets.Caltech256

    def inject_fake_data(self, tmpdir, config):
        tmpdir = pathlib.Path(tmpdir) / "caltech256" / "256_ObjectCategories"

-        categories = ((1, "ak47"), (127, "laptop-101"), (257, "clutter"))
+        categories = ((1, "ak47"), (2, "american-flag"), (3, "backpack"))


datasets.Caltech256 relies on the fact that all categories are present. When actually consuming the dataset (see above), the old fake data setup falls flat. Our options are:

1. Fix the dataset to account for gaps in the categories.
2. Create all categories as fake data.
3. Create fake data that starts at the first category and has no gaps, but does not cover all categories.

Option 3 is by far the least amount of work, so I went for that here.
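The failure mode can be modeled roughly as follows. This is a simplified sketch under two assumptions that are not stated in this thread: that the dataset derives each file's numeric prefix from the category's position in the sorted directory listing, and that category directories are named like "001.ak47". All function names are hypothetical.

```python
def expected_file_names(category_dirs, images_per_category=1):
    """File names as derived from each category's position in the listing."""
    names = []
    for position, category in enumerate(sorted(category_dirs)):
        for image in range(1, images_per_category + 1):
            # The prefix comes from the position, not the directory name.
            names.append(f"{category}/{position + 1:03d}_{image:04d}.jpg")
    return names


def actual_file_names(categories, images_per_category=1):
    """File names as the fake data writes them, using the real category number."""
    return [
        f"{number:03d}.{name}/{number:03d}_{image:04d}.jpg"
        for number, name in categories
        for image in range(1, images_per_category + 1)
    ]


def missing_files(categories):
    """Files the dataset would look for that the fake data never created."""
    dirs = [f"{number:03d}.{name}" for number, name in categories]
    expected = set(expected_file_names(dirs))
    return sorted(expected - set(actual_file_names(categories)))


with_gaps = ((1, "ak47"), (127, "laptop-101"), (257, "clutter"))
without_gaps = ((1, "ak47"), (2, "american-flag"), (3, "backpack"))
```

Under these assumptions, the old categories leave every category after the first pointing at a file that does not exist, while the gap-free categories line up.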

pmeier · 2023-08-22T12:37:10Z

torchvision/datasets/widerface.py

-                                    "occlusion": labels_tensor[:, 7],
-                                    "pose": labels_tensor[:, 8],
-                                    "invalid": labels_tensor[:, 9],
+                                    "bbox": labels_tensor[:, 0:4].clone(),  # x, y, width, height


Views on a tensor cannot be pickled correctly. This means that, regardless of the v2 wrapper, datasets.WIDERFace has never worked in a spawn context.

NicolasHug · 2023-08-22T16:51:14Z

Still LGTM, but the test duration is a bit unfortunate.

I think we'd be better off testing just 2-3 of these datasets (or even just 1 TBH, as one test is still infinitely better than zero). Also, ideally we'd have them use transforms, so that what we're testing is closer to a real-world use case.

…#7860) Summary: (Note: this ignores all push blocking failures!) Reviewed By: matteobettini Differential Revision: D48900370 fbshipit-source-id: be1b23dcab58d2a8b5bca7190f94c0123263d036

enforce pickleability for v2 transforms and wrapped datasets

60110cb

pmeier added enhancement module: transforms labels Aug 21, 2023

pmeier requested a review from NicolasHug August 21, 2023 12:19

NicolasHug reviewed Aug 21, 2023

NicolasHug approved these changes Aug 21, 2023

facebook-github-bot added the cla signed label Aug 21, 2023

use DataLoader for testing on select configs

c68a6de

pmeier added 3 commits August 22, 2023 10:44

cleanup

d228bdb

cleanup

1efe583

streamline v2 check

5358620

pmeier added 2 commits August 22, 2023 11:12

run DataLoader test only on macOS

af66bd0

only run v2 checks once per group

f339e6c

pmeier added 2 commits August 22, 2023 13:47

reinstate old test

1e35ee7

fix broken tests

edc043e

Merge branch 'main' into pickle

d256c3f

pmeier temporarily deployed to pytorchbot-env August 24, 2023 08:01 — with GitHub Actions Inactive

pmeier merged commit 054432d into pytorch:main Aug 24, 2023
32 of 62 checks passed

pmeier deleted the pickle branch August 24, 2023 08:02

RickLuiken mentioned this pull request Oct 25, 2023

Transforms are lost when using a dataloader with spawn #8066

Closed
