
'val Objectness' overfitting on j-series hyperparameters #453

Closed
glenn-jocher opened this issue Aug 14, 2019 · 3 comments

Comments

@glenn-jocher (Member)

@ktian08 evolved the current j-series hyperparameters, committed in early August, replacing the previous i-series parameters. They generally perform well (within 1% of darknet trained from scratch!), but they appear to cause overfitting in validation Confidence in particular.

yolov3/train.py, lines 36 to 55 in 907195d:
# Training hyperparameters j (50.5 mAP yolov3-320) evolved by @ktian08 https://github.com/ultralytics/yolov3/issues/310
hyp = {'giou': 1.582,  # giou loss gain
       'xy': 4.688,  # xy loss gain
       'wh': 0.1857,  # wh loss gain
       'cls': 27.76,  # cls loss gain
       'cls_pw': 1.446,  # cls BCELoss positive_weight
       'obj': 21.35,  # obj loss gain
       'obj_pw': 3.941,  # obj BCELoss positive_weight
       'iou_t': 0.2635,  # iou training threshold
       'lr0': 0.002324,  # initial learning rate
       'lrf': -4.,  # final LambdaLR learning rate = lr0 * (10 ** lrf)
       'momentum': 0.97,  # SGD momentum
       'weight_decay': 0.0004569,  # optimizer weight decay
       'hsv_s': 0.5703,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.3174,  # image HSV-Value augmentation (fraction)
       'degrees': 1.113,  # image rotation (+/- deg)
       'translate': 0.06797,  # image translation (+/- fraction)
       'scale': 0.1059,  # image scale (+/- gain)
       'shear': 0.5768}  # image shear (+/- deg)
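As a minimal sketch (no PyTorch required), here is how the 'lr0'/'lrf' pair above defines the learning-rate endpoints: per the inline comment, the final LambdaLR learning rate is lr0 * (10 ** lrf). The variable names below are illustrative, not from train.py.

```python
# Sketch: final learning rate implied by the j-series hyperparameters.
# train.py's comment states: final LambdaLR learning rate = lr0 * (10 ** lrf).
lr0 = 0.002324  # hyp['lr0'], initial learning rate
lrf = -4.0      # hyp['lrf'], exponent applied at the end of training

lr_final = lr0 * (10 ** lrf)  # learning rate after the full schedule
print(f"initial lr: {lr0:.6f}, final lr: {lr_final:.3e}")
```

So training decays the learning rate by four orders of magnitude from start to finish.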

See #446 (comment) by @phino10 and #310 (comment) by @Aurora33 for two overfitting examples.

One thing I notice is that hyp['obj_pw'] = 3.941 is very high compared to hyp['cls_pw'] = 1.446. This may cause aggressive performance gains at the beginning of training at the expense of overfitting later in training. One option would be to manually lower hyp['obj_pw'] and to increase hyp['obj'] to compensate, perhaps in equal measure (i.e. divide by 2 and multiply by 2).
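The adjustment proposed above can be sketched as follows; the rebalancing factor `k` and the in-place dict edit are illustrative assumptions, not committed code.

```python
# Hypothetical sketch of the proposed rebalancing: halve the objectness
# BCELoss positive_weight and double the objectness loss gain to compensate.
hyp = {'obj': 21.35,    # obj loss gain (current j-series value)
       'obj_pw': 3.941}  # obj BCELoss positive_weight (current j-series value)

k = 2.0  # rebalancing factor (illustrative, "in equal measure")
hyp['obj_pw'] /= k  # lower positive_weight -> less aggressive early objectness gains
hyp['obj'] *= k     # raise overall obj loss gain to keep total loss magnitude similar
print(hyp)
```

The idea is to keep the overall contribution of the objectness term roughly constant while reducing the per-positive-sample weighting that may drive early overfitting.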

@glenn-jocher glenn-jocher changed the title 'val Confidence' overfitting on j-series hyperparameters 'val Objectness' overfitting on j-series hyperparameters Aug 25, 2019
@glenn-jocher (Member Author)

Another example. Also of note: objectness validation losses are far higher than training losses, the only one of the three loss components to display this behavior.

[results plot: objectness validation loss diverging above training loss]

@glenn-jocher (Member Author)

glenn-jocher commented Aug 28, 2019

#472: 416 image size, no multi-scale.
[results image]

#310 (comment): 320 image size, multi-scale.
[results image]

@glenn-jocher (Member Author)

Problem resolved with new hyperparameters.
