fix nan/inf loss #490
Conversation
@Laughing-q I am still getting nan in my training. It seems the validation case is solved after running:
@hdnh2006 wait a while until this gets merged. We'll release the updated package later today.
well done @Laughing-q
@Laughing-q do you know what a typical value of this is? Should we add a smaller value for a protected divide, i.e.
@Laughing-q I debugged this value on COCO128 and it's very large, i.e. target_scores_sum = 1000 at batch-size 16, so I think this is fine to max at 1.
@glenn-jocher oh yes, I just want to tell you this. |
@AyushExel @glenn-jocher this PR fixed the nan loss issue. The cause is that the `target_scores_sum` we're using in the loss calculation could be 0 if there are no objects in the targets (empty labels, background only), so dividing by it produced nan. Test command:
before fix (losses are nan and mAP stays 0.15):
![pic-selected-230119-1453-35](https://user-images.githubusercontent.com/61612323/213380420-78b7355b-d4d9-411c-ba63-ea86f1a785d3.png)
![pic-selected-230119-1521-59](https://user-images.githubusercontent.com/61612323/213380429-c9f49799-e642-47f5-b631-4a0d1c43fb4f.png)
after fix:
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Enhanced stability in loss calculation during model training.
📊 Key Changes

- Clamped `target_scores_sum` by ensuring it's never less than 1.

🎯 Purpose & Impact
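The clamp described above can be sketched as follows. This is a minimal illustration of the protected-divide idea, not the actual Ultralytics implementation; `safe_normalize` and its argument names are hypothetical:

```python
def safe_normalize(total_loss: float, target_scores_sum: float) -> float:
    """Normalize a summed loss by the sum of target scores.

    Clamping the denominator to at least 1 means batches with no
    objects (target_scores_sum == 0) yield 0 instead of nan/inf.
    """
    return total_loss / max(target_scores_sum, 1.0)


# Background-only batch: no objects, so the score sum is 0.
print(safe_normalize(0.0, 0.0))     # 0.0 rather than nan

# Typical batch: the sum is large (e.g. ~1000 on COCO128 at
# batch-size 16, per the discussion above), so the clamp is a no-op.
print(safe_normalize(5.0, 1000.0))  # 0.005
```

Because typical values of the denominator are far above 1, clamping at 1 changes nothing for normal batches and only guards the empty-label edge case.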