
what does the training loss curve look like #27

Open
ghost opened this issue Nov 19, 2019 · 5 comments

Comments

@ghost

ghost commented Nov 19, 2019

I'm trying to train SSN via train_ssn.py, but after running ~40,000 iterations the training loss mostly jitters without any meaningful decrease. I know from reading previous issues that convergence takes roughly 500,000 iterations, but with my computing resources it would take a few days to reach convergence.

So I was wondering whether the authors could kindly tell me / show me what the training loss curve looks like as a function of iteration number, starting from iteration 0 all the way to convergence.

Thank you in advance.

@varunjampani
Contributor

Sorry, I don't have any saved logs or plots for this, and I may not find time soon to re-train and produce them.

@ghost
Author

ghost commented Nov 20, 2019

Would it be OK if I asked the following two questions instead?

  1. Is it unusual that the training loss sometimes increases during the early iterations? Can the training loss still decrease after increasing early on?

  2. How many iterations does it take for the training loss to start decreasing in a meaningful way?

@varunjampani
Contributor

If I remember correctly, Adam optimization produces a training curve with ups and downs, but I do not remember how strongly the loss fluctuates.
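As a rough illustration (not from the SSN code), one way to tell whether a curve with Adam's ups and downs is actually trending downward is to smooth the logged loss values with an exponential moving average. The loss values below are synthetic and purely for demonstration:

```python
import math
import random

def ema_smooth(values, alpha=0.98):
    """Exponential moving average; alpha close to 1 means heavier smoothing."""
    smoothed = []
    running = values[0]
    for v in values:
        running = alpha * running + (1 - alpha) * v
        smoothed.append(running)
    return smoothed

# Synthetic noisy-but-decreasing loss curve, standing in for a real training log.
random.seed(0)
raw = [math.exp(-i / 2000.0) + random.gauss(0, 0.1) for i in range(5000)]

trend = ema_smooth(raw)
print(f"raw last value: {raw[-1]:.3f}, smoothed last value: {trend[-1]:.3f}")
```

With enough smoothing, a slow downward trend becomes visible even when individual iterations jitter heavily.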

@CYang0515

@coarsesand In my experiments, the position loss increases with iterations, while the reconstruction loss decreases gradually.

@ghost
Author

ghost commented Jan 19, 2020

@CYang0515 does the decrease in reconstruction loss eventually overpower the increase in position loss, so that the overall combined loss decreases as the number of iterations grows?
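One way to answer this empirically is to log the two loss components separately alongside their sum. This is a hypothetical sketch, not code from train_ssn.py; the names, weights, and synthetic loss values are illustrative assumptions only:

```python
# Track position loss, reconstruction loss, and their weighted sum per
# iteration, so you can see which component drives the combined loss.
history = {"position": [], "reconstruction": [], "combined": []}

def log_losses(pos_loss, recon_loss, pos_weight=1.0):
    """Record both components and the combined loss; pos_weight is assumed."""
    combined = pos_weight * pos_loss + recon_loss
    history["position"].append(pos_loss)
    history["reconstruction"].append(recon_loss)
    history["combined"].append(combined)
    return combined

# Synthetic example matching the behavior described above: position loss
# creeps upward while reconstruction loss falls faster.
for i in range(1000):
    pos = 0.1 + 0.0001 * i          # slowly increasing
    recon = 1.0 / (1.0 + 0.01 * i)  # decreasing faster
    log_losses(pos, recon)

print(f"combined: first={history['combined'][0]:.3f}, "
      f"last={history['combined'][-1]:.3f}")
```

If the reconstruction loss falls faster than the position loss rises (as in this synthetic run), the combined curve still trends downward overall.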
