Hi, I'm working on an experiment where I noticed large differences between models trained with identical configs and random seeds. I'm trying to understand the causes of this.
I've upgraded to a more recent PyTorch version that introduces flags for deterministic training across multiple executions:
https://pytorch.org/docs/1.11/notes/randomness.html?highlight=reproducibility
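For reference, this is roughly how I set things up, following the linked randomness notes; the seed value here is just a placeholder, and the exact ordering in my code may differ:

```python
import os
import random

import numpy as np
import torch

# Required for deterministic cuBLAS results on CUDA >= 10.2; must be
# set before cuBLAS is first initialized (see the randomness notes).
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

SEED = 42  # placeholder; any fixed value

# Seed every RNG the training pipeline may draw from.
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)  # also seeds all CUDA devices

# Force deterministic algorithm choices; ops without a deterministic
# implementation raise a RuntimeError instead of running silently.
torch.use_deterministic_algorithms(True)

# Disable cuDNN autotuning and request deterministic cuDNN kernels.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
```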
However, despite using these flags and the most recent detectron2 sources, the final trained models and their validation accuracies can differ greatly (~2 AP) on a custom dataset of mine.
These differences occur across multiple runs on the same machine (identical device, code, config, and random seed).
I've been looking into reproducing this problem and can also observe it with the unaltered detectron2 demo training code. I've added a minimal script that reproduces the training and shows rather large differences between the first logged losses of three consecutive runs.
Instructions To Reproduce the Issue:
Full runnable code or full changes you made:
Script to reproduce the experiment: deterministic_example.py
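To give an idea of its shape: the script is essentially the stock demo training loop with the determinism setup from the snippet above applied first. A rough sketch (the config file, seed, and iteration count here are illustrative stand-ins, not the exact values from my script):

```python
# deterministic_example.py -- rough shape of the repro script.
# The determinism flags from the snippet above are applied at the
# very top, before any model or dataloader is constructed.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer


def main():
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(
        "COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml"))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
        "COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
    cfg.SEED = 42              # fixed seed for detectron2's own RNG streams
    cfg.SOLVER.MAX_ITER = 100  # a few iterations suffice to see losses diverge
    cfg.OUTPUT_DIR = "./output"

    trainer = DefaultTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()


if __name__ == "__main__":
    main()
```

Running this three times in a row and comparing the first logged losses already shows the divergence.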
Expected behavior:
I would expect the losses to be (largely) identical in the default training setup when using an identical machine, code, random seed, and config together with the PyTorch flags for deterministic training.
I'm still facing the issue. Without having debugged this in more detail, and just looking at the losses of the three runs, loss_cls appears to differ the most at the beginning of training.
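For anyone who wants to quantify this, a small sketch that diffs the logged loss_cls values between two runs, assuming detectron2's default metrics.json output (one JSON object per line) in each output directory; the paths are placeholders:

```python
import json


def load_losses(path, key="loss_cls"):
    """Read detectron2's metrics.json (one JSON object per line)
    and return (iteration, value) pairs for the given loss key."""
    values = []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            if key in record:
                values.append((record.get("iteration"), record[key]))
    return values


# Placeholder paths: point these at the output dirs of two runs.
run1 = load_losses("output_run1/metrics.json")
run2 = load_losses("output_run2/metrics.json")

for (it, v1), (_, v2) in zip(run1, run2):
    print(f"iter {it}: loss_cls {v1:.6f} vs {v2:.6f} (diff {abs(v1 - v2):.6f})")
```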
There have been similar issues closed in the past (e.g. #2480), pointing to PyTorch's non-determinism. Perhaps revisiting them with the new deterministic training flags in PyTorch could give new pointers.
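One way those flags can give pointers: with torch.use_deterministic_algorithms(True), any op that lacks a deterministic implementation raises a RuntimeError naming the offending op, so wrapping a single training step can surface the responsible kernels. A rough sketch, where the tiny model below is just a stand-in for the real detectron2 forward/backward step:

```python
import torch

torch.use_deterministic_algorithms(True)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for one detectron2 training step; substitute the real
# model forward/backward to see which op (if any) gets flagged.
model = torch.nn.Linear(8, 4).to(device)
x = torch.randn(2, 8, device=device)

try:
    loss = model(x).sum()
    loss.backward()
    print("No non-deterministic ops hit in this step.")
except RuntimeError as err:
    # PyTorch names the offending op and often suggests a workaround.
    print(f"Non-deterministic op detected: {err}")
```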