'val Objectness' overfitting on j-series hyperparameters #453
This was referenced Aug 14, 2019.
glenn-jocher changed the title from 'val Confidence' overfitting on j-series hyperparameters to 'val Objectness' overfitting on j-series hyperparameters on Aug 25, 2019.
#472: 416 no multiscale. #310 (comment): 320 multiscale.
Problem resolved with new hyperparameters.
@ktian08 evolved the current j-series hyperparameters, which were committed in early August, replacing the previous i-series parameters. They generally perform well (within 1% of darknet trained from scratch!) but appear to cause overfitting in validation Objectness in particular.
yolov3/train.py
Lines 36 to 55 in 907195d
See #446 (comment) by @phino10 and #310 (comment) by @Aurora33 for two overfitting examples.
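To make the comparison below concrete, here is a plain-math sketch of what a BCE positive weight does: it scales the loss on positive (object-present) targets, so a large obj_pw penalizes objectness errors on positives much harder than cls_pw penalizes classification errors. This assumes obj_pw and cls_pw are used as pos_weight values in the BCE losses, which is how such hyperparameters are typically wired in; the helper function here is illustrative, not code from this repo.

```python
import math

def weighted_bce(p, y, pos_weight=1.0):
    """Binary cross-entropy for one prediction p against target y,
    with the loss on the positive term scaled by pos_weight."""
    eps = 1e-12  # guard against log(0)
    return -(pos_weight * y * math.log(p + eps)
             + (1 - y) * math.log(1 - p + eps))

obj_pw, cls_pw = 3.941, 1.446  # values from the issue text
p, y = 0.5, 1.0                # an uninformative prediction on a positive target

# Relative penalty on positives: objectness errors cost ~2.7x classification errors.
ratio = weighted_bce(p, y, obj_pw) / weighted_bce(p, y, cls_pw)
print(round(ratio, 3))  # 2.725
```

That imbalance is one plausible mechanism for the pattern in the linked reports: fast early gains in objectness followed by later overfitting.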
One thing I notice is that hyp['obj_pw'] = 3.941 is very high compared to hyp['cls_pw'] = 1.446. This may cause aggressive performance gains at the beginning of training at the expense of overfitting later in training. One option would be to manually lower hyp['obj_pw'] and increase hyp['obj'] to compensate, perhaps in equal measure (i.e. divide by 2 and multiply by 2).