
Leaked semaphores with DDP training #1461

Closed
VitorGuizilini opened this issue Apr 12, 2020 · 2 comments · Fixed by #2029
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

@VitorGuizilini (Contributor)

I constantly get this warning when training on an AWS instance (8 GPUs, using DDP). It does not crash, but training hangs for a few seconds before continuing:

    /usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 3 leaked semaphores to clean up at shutdown

I can share my docker container if necessary, as it might be an issue with library versions.
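For context on the warning itself: it comes from CPython's `semaphore_tracker` (merged into `resource_tracker` in Python 3.8), which reports named POSIX semaphores that were still registered when the interpreter shut down. DDP spawns worker processes, and if those workers exit without their `multiprocessing` primitives being cleaned up, the tracker emits this message; it is usually harmless but can add a cleanup pause at shutdown. The sketch below is not Lightning's code, just a minimal stdlib illustration of the resources involved and the explicit cleanup (`join`, `close`, `join_thread`) that keeps the tracker from finding anything to reclaim; the function names are illustrative.

```python
import multiprocessing as mp

def _worker(sem, q):
    # Acquire the semaphore, do the "work", and release it via the context manager.
    with sem:
        q.put("done")

def run_demo():
    """Spawn one child and tear down every multiprocessing resource explicitly."""
    sem = mp.Semaphore(1)   # backed by a named POSIX semaphore (tracked)
    q = mp.Queue()          # also allocates a semaphore internally
    p = mp.Process(target=_worker, args=(sem, q))
    p.start()
    result = q.get()        # blocks until the child has produced its result
    p.join()                # reap the child so nothing outlives shutdown
    q.close()               # release the queue's feeder resources
    q.join_thread()
    return result

if __name__ == "__main__":
    print(run_demo())
```

If processes are instead killed mid-training (or exit on an exception before cleanup), the tracker finds the orphaned semaphores at shutdown and prints exactly the kind of warning quoted above.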

@VitorGuizilini added the bug and help wanted labels on Apr 12, 2020

@VitorGuizilini (Contributor, Author)

FYI, this behaviour only shows up on the latest master (8544b33); if I install 0.7.3 it disappears.

@tullie (Contributor) commented Apr 19, 2020

@vguizilini I haven't been able to reproduce this on the latest master while running with 8 GPUs using DDP. Are you still getting the warning?
