Missing training_step outputs in training_epoch_end #2327

mmiakashs · 2020-06-23T04:57:50Z

Possible bug fix of #2320

…ss all the batch outputs to training_epoch_end(if user defined this method)

Borda

Mind adding a test for this case? Probably a simple example from #2320 would do.

codecov · 2020-06-23T05:19:25Z

Codecov Report

Merging #2327 into master will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2327   +/-   ##
======================================
  Coverage      88%     88%           
======================================
  Files          70      70           
  Lines        5501    5503    +2     
======================================
+ Hits         4834    4836    +2     
  Misses        667     667

williamFalcon · 2020-06-23T15:17:55Z

@mmiakashs mind trying master now? the solution in this PR wasn't quite 100% right and needed more testing.

This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

mergify · 2020-06-23T15:18:42Z

This pull request is now in conflict... :(

Borda · 2020-06-23T15:25:46Z

> This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

That was still to be done: this PR was supposed to be merged into the other PR, but since this one is closed and the other is merged, there is nothing left to do... Please ping me next time before closing/merging 🐰

williamFalcon · 2020-06-23T15:26:35Z

yes... but this PR was incorrect

mmiakashs · 2020-06-25T00:00:20Z

> @mmiakashs mind trying master now? the solution in this PR wasn't quite 100% right and needed more testing.
>
> This PR is likely not needed anymore but we need to add you as co-author to #2328 @Borda

@williamFalcon Thanks a lot for the PR. One point of confusion: I just noticed that all the training_step end log metrics are combined under the dict key 'log_metrics', whereas the validation log metrics are combined under the dict key 'log'. Is this difference intentional?

mmiakashs · 2020-06-25T00:10:37Z

@williamFalcon I debugged again and found that issue #2320 still occurs, but only for training_step outputs: the training_step outputs for the first optimizer iteration are missing, while the second optimizer iteration's outputs are merged properly.

Borda · 2020-06-25T05:59:37Z

@mmiakashs do you see a fix for it? Mind sending a PR?

collect all the split batch optimizers iteration batch outputs and pass all the batch outputs to training_epoch_end (if user defined this method)

117b7ab
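The idea in the commit title above can be illustrated with a minimal, framework-free sketch. It does not reproduce PyTorch Lightning's actual training loop or the code in this PR: the function names `run_epoch_last_only`, `run_epoch_collect_all`, `toy_training_step`, and `toy_training_epoch_end` are hypothetical, and the "keep only the last optimizer's output" loop is just one assumed way the symptom reported in this thread (first optimizer iteration outputs missing) could arise. Split batches are left out; they would add one more nested loop of the same shape.

```python
from typing import Callable, Dict, List

StepOutput = Dict[str, int]
StepFn = Callable[[int, int], StepOutput]


def toy_training_step(batch_idx: int, optimizer_idx: int) -> StepOutput:
    """Hypothetical stand-in for a user-defined training_step."""
    return {"batch_idx": batch_idx, "optimizer_idx": optimizer_idx}


def run_epoch_last_only(num_batches: int, num_optimizers: int, step: StepFn) -> List[StepOutput]:
    """Assumed faulty pattern: each optimizer iteration overwrites the previous output."""
    epoch_outputs: List[StepOutput] = []
    for batch_idx in range(num_batches):
        batch_output: StepOutput = {}
        for optimizer_idx in range(num_optimizers):
            batch_output = step(batch_idx, optimizer_idx)  # earlier outputs are lost here
        epoch_outputs.append(batch_output)
    return epoch_outputs


def run_epoch_collect_all(num_batches: int, num_optimizers: int, step: StepFn) -> List[StepOutput]:
    """Collect the output of every optimizer iteration for every batch."""
    epoch_outputs: List[StepOutput] = []
    for batch_idx in range(num_batches):
        for optimizer_idx in range(num_optimizers):
            epoch_outputs.append(step(batch_idx, optimizer_idx))
    return epoch_outputs


def toy_training_epoch_end(outputs: List[StepOutput]) -> Dict[int, int]:
    """Hypothetical epoch-end hook: count how many outputs each optimizer contributed."""
    counts: Dict[int, int] = {}
    for out in outputs:
        counts[out["optimizer_idx"]] = counts.get(out["optimizer_idx"], 0) + 1
    return counts


if __name__ == "__main__":
    print(toy_training_epoch_end(run_epoch_last_only(3, 2, toy_training_step)))
    print(toy_training_epoch_end(run_epoch_collect_all(3, 2, toy_training_step)))
```

With two optimizers, the first variant hands the epoch-end hook outputs from the second optimizer only, while the second variant hands it one output per batch per optimizer.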

mergify bot requested a review from a team June 23, 2020 04:58

Borda added the "bug" label ("Something isn't working") Jun 23, 2020

Borda approved these changes Jun 23, 2020

mergify bot requested a review from a team June 23, 2020 05:17

Borda added the "good first issue" label ("Good for newcomers") Jun 23, 2020

williamFalcon mentioned this pull request Jun 23, 2020

refactored training_batch + tests to verify correctness #2328

Merged

Borda added the "ready" label ("PRs ready to be merged") Jun 23, 2020

williamFalcon closed this Jun 23, 2020
