The loss does not decrease during training #10
This is very strange. If the loss doesn't go down in the first epoch, it won't go down afterwards. Try lower learning rates; it looks like the training is diverging. The learning rate you're using should be the correct one, which makes this even stranger. If anyone else has this problem, it would be nice if they reported it here.
@developer-mayuan I just managed to run the training without crashing (per the solution you pointed out), but I seem to be running into a similar problem. This is the output I have:
It just keeps oscillating. I have also tried different values for alpha and lr, but none of them yields any meaningful outcome.
@numitors My loss is much larger than yours. I think you can wait a few more epochs to see if the yaw loss gets lower. In my case, the yaw loss oscillates around 3000, which means the network has learned essentially nothing...
I found a bug in my modified code; now the algorithm works. I will close this issue.
@developer-mayuan What was your issue exactly? I also saw the behavior you are describing; it probably just depends on the initialization.
And it goes like that on and on.
@numitors I think this issue is related to the initialization. You need to try several times.
What results did you get when you trained on the 300W_LP dataset yourself? I managed to train the model, but I am facing the same problem you describe, i.e. the training is sensitive to the initialization.
@kalyo-zjl I can get my losses very low after 25 epochs (around 15 for each angle). You could try decreasing the learning rate every 15 epochs, for example, to see if that works.
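A minimal sketch of that step-decay suggestion, assuming a standard PyTorch setup; the model, epoch count, and loop body here are placeholders, not the repository's actual training script:

```python
import torch
import torch.nn as nn

# Toy stand-in for the network; the real model and loop live in the repo's train script.
model = nn.Linear(10, 3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# Multiply the learning rate by 0.1 every 15 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)

for epoch in range(25):
    # ... run one training epoch here ...
    scheduler.step()
    print(epoch, optimizer.param_groups[0]['lr'])
```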
@kalyo-zjl By the way, I suggest using TensorBoard to visualize the loss instead of just printing it to the console.
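For example, with `torch.utils.tensorboard` (or the `tensorboardX` package on older PyTorch versions); the tag name and dummy values below are illustrative:

```python
from torch.utils.tensorboard import SummaryWriter  # or: from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir='runs/hopenet')  # inspect with: tensorboard --logdir runs

for step in range(100):
    yaw_loss = 3000.0 / (step + 1)  # dummy value; log the real loss tensor's .item() here
    writer.add_scalar('loss/yaw', yaw_loss, step)

writer.close()
```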
Hi @developer-mayuan,
@kalyo-zjl I just edited my previous response: my loss for each axis is around 15. The training procedure is not very consistent; sometimes you can get a very low yaw loss after just 200 iterations.
@developer-mayuan @kalyo-zjl @numitors I fixed a bug in the training code that, combined with the new PyTorch update, was causing training to be very unstable. Please try it again now. |
@developer-mayuan @kalyo-zjl Hi guys, I need your help with the following points. Please help:
@developer-mayuan I want to know the performance you got when testing on AFLW2000. I reproduced this project using TensorFlow; the loss for each of the three angles is about 2, but when I test on the AFLW2000 dataset the MAE is about 17, and I cannot get the performance described in the paper.
Hi natanielruiz,
I have been trying to reproduce your paper's results over the past few days, but I cannot get the loss to decrease when I train your model on the 300W_LP dataset. I used the same parameters you provided in your paper:
alpha = 1, lr = 1e-5, and default parameters for the Adam optimizer.
I ran your network for 25 epochs and the yaw loss oscillates around 3000. An MSE of 3000 corresponds to an average yaw error of roughly 55 degrees, so the network has effectively learned nothing.
Do you have any idea how to debug the network or solve this issue? Thank you very much for your help!
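For reference, a sketch of the combined loss the paper describes (cross-entropy over binned angles plus alpha times an MSE term on the expected angle, per angle). The bin count and degree mapping follow my reading of the paper, and all names here are illustrative rather than the repository's exact code:

```python
import torch
import torch.nn.functional as F

alpha = 1.0  # weight from the setup quoted above
idx_tensor = torch.arange(66, dtype=torch.float32)  # assumed: 66 bins of 3 degrees covering [-99, 99]

def angle_loss(logits, bin_labels, cont_labels):
    """Cross-entropy over binned angles + alpha * MSE on the expected angle (in degrees)."""
    ce = F.cross_entropy(logits, bin_labels)
    # Expected angle under the softmax distribution, mapped back to degrees.
    pred_deg = torch.sum(F.softmax(logits, dim=1) * idx_tensor, dim=1) * 3 - 99
    mse = F.mse_loss(pred_deg, cont_labels)
    return ce + alpha * mse

# Dummy usage: batch of 4 with random logits and labels.
# The optimizer in the issue would then be torch.optim.Adam(model.parameters(), lr=1e-5).
logits = torch.randn(4, 66)
bin_labels = torch.randint(0, 66, (4,))
cont_labels = bin_labels.float() * 3 - 99
print(angle_loss(logits, bin_labels, cont_labels))
```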