Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

set device to root gpu in single GPU case #3042

Merged
merged 1 commit into from
Aug 18, 2020
Merged

Conversation

ananyahjha93
Copy link
Contributor

Fixes #3030

are there suggested tests for this PR?

Copy link
Contributor

@nateraw nateraw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 😄

If possible, mind adding quick test for this? Approving so we don't block this bugfix either way.

@ananyahjha93
Copy link
Contributor Author

@awaelchli weren't the random failing tests fixed?

@awaelchli
Copy link
Contributor

@ananyahjha93 I think a simple test would be to check
torch.cuda.current_device() == self.root_gpu
for the relevant backends, if I understand correctly.

@awaelchli
Copy link
Contributor

@ananyahjha93 I thought so. Can we rerun the CI to check if it's the same accuracy value. I want to see if it's deterministic.
One thing is suspicious is that in the test we set os.environ['CUDA_VISIBLE_DEVICES'] = '0,1', and this PR is setting the device also.

@ananyahjha93
Copy link
Contributor Author

@awaelchli should pass the GPU tests, the failure was ddp_spawn test on the accuracy metric.

@awaelchli
Copy link
Contributor

the test passes now in a second run, which means despite setting the deterministic flag in the Trainer and seed_everything(1234), something in that test is still random.

@codecov
Copy link

codecov bot commented Aug 18, 2020

Codecov Report

Merging #3042 into master will decrease coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #3042   +/-   ##
======================================
- Coverage      90%     90%   -0%     
======================================
  Files          81      81           
  Lines        7672    7685   +13     
======================================
+ Hits         6906    6912    +6     
- Misses        766     773    +7     

@williamFalcon williamFalcon merged commit 9445c80 into master Aug 18, 2020
@ananyahjha93 ananyahjha93 deleted the bugfix/default_cuda branch August 18, 2020 23:35
@Borda Borda added the bug Something isn't working label Aug 20, 2020
@Borda Borda added this to the 0.9.0 milestone Aug 20, 2020
@Borda
Copy link
Member

Borda commented Aug 21, 2020

still no test, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect default cuda device when using single gpu other than cuda:0
5 participants