Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added reduce ddp results on eval #2434

Merged
merged 11 commits into from
Jun 30, 2020
Merged

added reduce ddp results on eval #2434

merged 11 commits into from
Jun 30, 2020

Conversation

williamFalcon
Copy link
Contributor

Enhancement: aggregates ddp outputs on eval

Fixes #702

@mergify mergify bot requested a review from a team June 30, 2020 18:51
@pep8speaks
Copy link

pep8speaks commented Jun 30, 2020

Hello @williamFalcon! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-30 19:19:32 UTC

@@ -347,6 +354,22 @@ def _evaluate(

return eval_results

def reduce_eval_ddp(self, eval_results):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this traversal of nested types looks like what @justusschock did in metrics. maybe one could re-use his apply_to_collection from utilitiles.apply_func.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point @justusschock how would i adapt that here?

Copy link
Member

@awaelchli awaelchli Jun 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick brain-compiled code:

eval_results = apply_to_collection(
    eval_results, 
    dtype=torch.Tensor, 
    function=reduce_eval_ddp, 
    world_size=self.world_size
)

with reduce_eval_ddp defined as

def reduce_eval_ddp(v, world_size):
    dist.all_reduce(v, op=dist.reduce_op.SUM)
    v = v / world_size
    return v

@mergify mergify bot requested a review from a team June 30, 2020 19:12
@codecov
Copy link

codecov bot commented Jun 30, 2020

Codecov Report

Merging #2434 into master will decrease coverage by 0%.
The diff coverage is 29%.

@@          Coverage Diff           @@
##           master   #2434   +/-   ##
======================================
- Coverage      89%     88%   -0%     
======================================
  Files          69      69           
  Lines        5485    5499   +14     
======================================
+ Hits         4858    4862    +4     
- Misses        627     637   +10     

@Borda Borda added the bug Something isn't working label Jun 30, 2020
@williamFalcon williamFalcon merged commit 309ed75 into master Jun 30, 2020
@Borda Borda added this to the 0.8.x milestone Jun 30, 2020
@williamFalcon williamFalcon deleted the val_reduce branch July 5, 2020 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bringing together results on ddp on a single machine
4 participants