ModelCheckpoint epoch and progress bar epoch mismatch #1946

Closed
versatran01 opened this issue May 25, 2020 · 6 comments
Labels
help wanted Open to be worked on won't fix This will not be worked on

Comments

@versatran01

🐛 Bug

Progress bar epoch starts at 1, but in ModelCheckpoint it starts at 0.

To Reproduce

Steps to reproduce the behavior:

https://github.com/PyTorchLightning/pytorch-lightning/blob/8ca8336ce52ee7379f4d399520636143eb31018b/pytorch_lightning/callbacks/progress.py#L320

https://github.com/PyTorchLightning/pytorch-lightning/blob/8ca8336ce52ee7379f4d399520636143eb31018b/pytorch_lightning/callbacks/model_checkpoint.py#L212

Expected behavior

Shouldn't they be consistent?
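
To make the mismatch concrete, here is a minimal sketch in plain Python. The two helpers, `progress_bar_label` and `checkpoint_name`, are hypothetical stand-ins and not Lightning's actual code; the label and filename formats are assumptions chosen only to show one internal epoch index being rendered with two different bases.

```python
# Hypothetical stand-ins for the two places that render the epoch number.
# Neither function is part of Lightning; formats are illustrative only.

def progress_bar_label(epoch_idx: int) -> str:
    # Assumption: the progress bar shows a 1-based epoch number.
    return f"Epoch {epoch_idx + 1}"


def checkpoint_name(epoch_idx: int) -> str:
    # Assumption: the checkpoint filename uses the raw 0-based index.
    return f"epoch={epoch_idx}.ckpt"


# The same internal index yields two different visible epoch numbers.
labels = [(progress_bar_label(i), checkpoint_name(i)) for i in range(3)]
for bar, ckpt in labels:
    print(f"{bar:<8} -> {ckpt}")
```

With both helpers using the same base, the number in the progress bar and the number in the filename would refer to the same epoch.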

@versatran01 versatran01 added the help wanted Open to be worked on label May 25, 2020
@williamFalcon
Contributor

@Borda FYI.
In 0.8.0 we are setting all indexing to start from 0.

@versatran01
Author

What is the epoch number of the val sanity check? I don't want to write the results of the sanity check to my logs. Is there a way to check whether validation is currently running as part of the sanity check?
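
One way to keep sanity-check results out of the logs is to guard the logging call with a flag, sketched below in plain Python. Everything here is hypothetical: `MetricLog`, `on_validation_end`, and the `in_sanity_check` argument are illustrative names, not Lightning APIs. Whether the trainer exposes such a flag, and under what name, depends on the Lightning version and is not stated in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetricLog:
    """Hypothetical logger stand-in that only collects records in memory."""
    records: List[Dict[str, float]] = field(default_factory=list)

    def log(self, metrics: Dict[str, float], epoch: int) -> None:
        self.records.append({"epoch": epoch, **metrics})


def on_validation_end(log: MetricLog, metrics: Dict[str, float],
                      epoch: int, in_sanity_check: bool) -> None:
    # Assumption: the caller knows whether this validation pass is the
    # pre-training sanity check and passes that in as a boolean.
    if in_sanity_check:
        return  # drop sanity-check results instead of logging them
    log.log(metrics, epoch)


log = MetricLog()
on_validation_end(log, {"val_loss": 2.3}, epoch=0, in_sanity_check=True)
on_validation_end(log, {"val_loss": 1.9}, epoch=0, in_sanity_check=False)
on_validation_end(log, {"val_loss": 1.5}, epoch=1, in_sanity_check=False)
print(log.records)
```

Only the two real validation passes end up in `log.records`; the sanity-check pass is skipped regardless of which epoch number it reports.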

@versatran01
Author

versatran01 commented Jun 19, 2020

This does not seem to be true in 0.8.0. Epoch still starts from 1.
https://github.com/PyTorchLightning/pytorch-lightning/blob/3256fe4e5a405db1ab00d4cf4d48cbbfc7730959/pytorch_lightning/trainer/training_loop.py#L350
This is even worse than before, because now all my validation metrics in TensorBoard are shifted by 1.

@williamFalcon
Contributor

yes just noticed that too... they should start from 0 @Borda

@Borda
Member

Borda commented Jun 19, 2020

> yes just noticed that too... they should start from 0 @Borda

Ohh, that was my misunderstanding. I thought it should start from 1, and because of the confusion I was asking about the different indexing of steps and epochs in #2206 (comment).

Well, before that it was indexed from 0...

OK, reverting so that epoch indexing starts from 0 again.

@stale

stale bot commented Aug 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Aug 19, 2020