
Add generator and worker seed #8602

Merged — 6 commits merged into ultralytics:master on Jul 22, 2022

Conversation

@UnglvKitDe (Contributor) commented Jul 16, 2022

Worker seed and generator inserted into the dataloader as described in #8601

πŸ› οΈ PR Summary

Made with ❀️ by Ultralytics Actions

🌟 Summary

Enhance randomness control in YOLOv5 dataloaders for more consistent training results. πŸ”„

πŸ“Š Key Changes

  • Added a seed_worker function that initializes the random seeds for each dataloader worker.
  • Set a manual seed and a torch.Generator on the PyTorch DataLoader (a minimal sketch follows this list).
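
A minimal, hedged sketch of the approach described above (the PR's commits touch dataloaders.py; the helper name matches the summary, but the dataset and seed value below are illustrative stand-ins):

import random

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset


def seed_worker(worker_id):
    # Derive a per-worker seed from PyTorch's per-worker initial seed and use it
    # to seed NumPy and Python's random module, so worker-side augmentations
    # become reproducible.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


generator = torch.Generator()
generator.manual_seed(0)  # illustrative fixed seed, not necessarily the PR's value

dataset = TensorDataset(torch.arange(64).float())  # stand-in for the YOLOv5 dataset
loader = DataLoader(dataset,
                    batch_size=16,
                    shuffle=True,
                    num_workers=4,
                    worker_init_fn=seed_worker,
                    generator=generator)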

🎯 Purpose & Impact

  • 🎲 Improve Consistency: Ensure that each worker in a dataloader initializes with a specific seed for better reproducibility.
  • ✨ Controlled Randomness: The changes allow for more controlled randomness during data loading, which can contribute to more consistent results across different runs.
  • πŸ‘©β€πŸ”¬ Research Friendly: Facilitate experimental consistency for researchers and developers, enabling fair comparison of models and training regimes.

@glenn-jocher glenn-jocher linked an issue Jul 17, 2022 that may be closed by this pull request
@glenn-jocher (Member)

@UnglvKitDe I'm seeing identical reproducible results with master and torch>=1.12.0. It doesn't seem we need any further modifications, e.g. running python train.py --epochs 3 in Colab:
[Screenshot, 2022-07-19: Colab training output showing identical results across runs]

@glenn-jocher (Member)

@UnglvKitDe are these DDP-specific requirements for reproducibility?

@UnglvKitDe (Contributor, Author)

@glenn-jocher Hmm, I implemented it in my version partly because it is the recommended practice and I didn't want to take any risks with edge cases, and partly because I once read about a bug of this kind. That bug seems to have been fixed in the meantime, and I could not reproduce it myself, but tutorials (like the one linked here) still describe how the problem can occur, especially with many workers.
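
For context, the failure mode referred to above is the well-known one where fork-started dataloader workers inherit identical NumPy RNG state, so per-sample "random" draws can repeat across workers. A minimal sketch (not from this PR) that can exhibit it on platforms that start workers with fork:

import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class NaiveRandomDataset(Dataset):
    # Draws from NumPy's global RNG inside __getitem__, the pattern that triggers the issue.
    def __getitem__(self, index):
        return np.random.randint(0, 1000)

    def __len__(self):
        return 8


# Without a worker_init_fn, fork-started workers may begin from the same NumPy RNG
# state, so the printed batches can contain repeated values across workers.
loader = DataLoader(NaiveRandomDataset(), batch_size=2, num_workers=4)
for batch in loader:
    print(batch)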

@glenn-jocher (Member)

@UnglvKitDe I can verify that I am not seeing reproducible results with DDP trainings, we probably do need this PR merged then.

I'm worried about the same seed on all workers though, I think this may impact augmentation by repeating the same augmentations. Perhaps we should set worker seeds equal to their RANK?
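
A hedged sketch of what rank-aware seeding could look like (RANK is the standard DDP environment variable; the base seed is an arbitrary placeholder, not what this PR or any follow-up uses):

import os

import torch

# Sketch only: offset the DataLoader generator seed by the DDP rank so every
# process draws a distinct shuffling/augmentation stream.
base_seed = 0  # arbitrary placeholder
rank = int(os.getenv('RANK', 0))  # 0 when not running under DDP
generator = torch.Generator()
generator.manual_seed(base_seed + rank)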

@glenn-jocher glenn-jocher merged commit 1c5e92a into ultralytics:master Jul 22, 2022
@glenn-jocher (Member)

@UnglvKitDe PR is merged. Thank you for your contributions to YOLOv5 πŸš€ and Vision AI ⭐

@glenn-jocher (Member)

@UnglvKitDe I tested DDP training but I do not see reproducible results after this PR, unfortunately. It seems something else is missing.

@UnglvKitDe (Contributor, Author)

> @UnglvKitDe I can verify that I am not seeing reproducible results with DDP trainings, we probably do need this PR merged then.
>
> I'm worried about the same seed on all workers though, I think this may impact augmentation by repeating the same augmentations. Perhaps we should set worker seeds equal to their RANK?

@glenn-jocher Each worker has a different worker_seed. Here is a small example:

import os
import random

import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class RandomDataset(Dataset):
    def __getitem__(self, index):
        # Value sampled with NumPy's RNG (always 0 with these bounds; the seeds
        # printed by seed_worker are the point of the demo).
        return np.random.randint(0, 1, 1)

    def __len__(self):
        return 16


def seed_worker(worker_id):
    # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
    # torch.initial_seed() already differs per worker process, so the derived
    # NumPy/random seeds differ per worker as well.
    init_seed = torch.initial_seed()
    worker_seed = init_seed % 2**32
    print(os.getpid(), worker_seed, init_seed)
    np.random.seed(worker_seed)
    random.seed(worker_seed)


dataset = RandomDataset()
dataloader = DataLoader(dataset, batch_size=2, num_workers=4,
                        worker_init_fn=seed_worker)
for epoch in range(1):
    print(f"epoch: {epoch}")
    for batch in dataloader:
        print(batch)
    print("-" * 25)
[Screenshot: script output showing a distinct worker_seed per worker process]

@glenn-jocher (Member)

@UnglvKitDe oh perfect, got it! I implemented an additional PR #8688, but I still don't get reproducible DDP results even afterwards. Not sure why, unfortunately.

ctjanuhowski pushed a commit to ctjanuhowski/yolov5 that referenced this pull request Sep 8, 2022
* Add generator and worker seed

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dataloaders.py

* Update dataloaders.py

* Update dataloaders.py

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Successfully merging this pull request may close these issues.

Reproducibility in multi-process data loading