KLD

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Abstract

Existing rotated object detectors are mostly inherited from the horizontal detection paradigm, as the latter has evolved into a well-developed area. However, these detectors struggle to perform well in high-precision detection because of limitations in current regression loss design, especially for objects with large aspect ratios. Taking the perspective that horizontal detection is a special case of rotated object detection, in this paper we are motivated to change the design of the rotation regression loss from an induction paradigm to a deduction methodology, in terms of the relation between rotated and horizontal detection. We show that one essential challenge is how to modulate the coupled parameters in the rotation regression loss, so that the estimated parameters can influence each other during the dynamic joint optimization in an adaptive and synergetic way. Specifically, we first convert the rotated bounding box into a 2-D Gaussian distribution, and then calculate the Kullback-Leibler Divergence (KLD) between the Gaussian distributions as the regression loss. By analyzing the gradient of each parameter, we show that KLD (and its derivatives) can dynamically adjust the parameter gradients according to the characteristics of the object. For instance, it adjusts the importance (gradient weight) of the angle parameter according to the aspect ratio. This mechanism can be vital for high-precision detection, as a slight angle error causes a serious accuracy drop for objects with large aspect ratios. More importantly, we prove that KLD is scale invariant. We further show that the KLD loss degenerates into the popular $l_{n}$-norm loss for horizontal detection. Experimental results on seven datasets using different detectors show its consistent superiority.
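The two steps named in the abstract (box to 2-D Gaussian, then KL divergence between Gaussians) can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation shipped with this repository: the function names `box_to_gaussian` and `kld` are hypothetical, the box is assumed to be `(cx, cy, w, h, theta)` with `theta` in radians, and the covariance is assumed to follow the common construction $\Sigma = R\,\mathrm{diag}(w^2/4,\ h^2/4)\,R^\top$. Any further transformation a training loss might apply to the raw divergence is left out.

```python
import numpy as np


def box_to_gaussian(box):
    """Map a rotated box (cx, cy, w, h, theta in radians) to (mean, covariance).

    Assumed construction: mean = box center, covariance = R diag(w^2/4, h^2/4) R^T.
    """
    cx, cy, w, h, theta = box
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Squared half-extents along the box's own axes.
    diag = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    return np.array([cx, cy], dtype=float), rot @ diag @ rot.T


def kld(box_p, box_t):
    """Raw KL divergence D(N_p || N_t) between the Gaussians of two rotated boxes."""
    mu_p, sigma_p = box_to_gaussian(box_p)
    mu_t, sigma_t = box_to_gaussian(box_t)
    sigma_t_inv = np.linalg.inv(sigma_t)
    delta = mu_p - mu_t
    center_term = 0.5 * delta @ sigma_t_inv @ delta
    shape_term = 0.5 * np.trace(sigma_t_inv @ sigma_p)
    log_term = 0.5 * np.log(np.linalg.det(sigma_t) / np.linalg.det(sigma_p))
    # The constant -1 is -d/2 for d = 2 dimensions.
    return float(center_term + shape_term + log_term - 1.0)


# Same 5-degree angle error applied to an elongated box and a near-square box.
err = np.deg2rad(5.0)
elongated = kld((0, 0, 100, 10, err), (0, 0, 100, 10, 0.0))
near_square = kld((0, 0, 30, 20, err), (0, 0, 30, 20, 0.0))

# Scaling both boxes by the same factor leaves the divergence unchanged.
small = kld((10, 5, 40, 8, 0.3), (12, 6, 44, 10, 0.2))
large = kld((100, 50, 400, 80, 0.3), (120, 60, 440, 100, 0.2))
```

In this sketch the elongated box yields a much larger divergence than the near-square one for the same angle error, which matches the abstract's point that the angle parameter matters more at large aspect ratios, and `small` and `large` agree up to floating-point error, consistent with the scale-invariance claim.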

Results and models

DOTA1.0

| Backbone | mAP | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 (1024,1024,200) | 64.55 | oc | 1x | 3.38 | 15.7 | - | 2 | rotated_retinanet_hbb_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 69.94 | oc | 1x | 3.39 | 15.6 | - | 2 | rotated_retinanet_hbb_kld_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 69.86 | oc | 1x | 3.35 | 15.8 | - | 2 | rotated_retinanet_hbb_kld_stable_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 68.42 | le90 | 1x | 3.38 | 16.9 | - | 2 | rotated_retinanet_obb_r50_fpn_1x_dota_le90 | model \| log |
| ResNet50 (1024,1024,200) | 70.22 | le90 | 1x | 3.35 | 16.9 | - | 2 | rotated_retinanet_obb_kld_stable_r50_fpn_1x_dota_le90 | model \| log |
| ResNet50 (1024,1024,200) | 71.30 | le90 | 1x | 3.61 | 16.9 | - | 2 | rotated_retinanet_obb_kld_stable_r50_adamw_fpn_1x_dota_le90 | model \| log |
| ConvNeXt-T (1024,1024,200) | 74.49 | le90 | 1x | 6.12 |  | - | 2 | rotated_retinanet_obb_kld_stable_convnext_adamw_fpn_1x_dota_le90 | model \| log |

| Backbone | mAP | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 (1024,1024,200) | 69.80 | oc | 1x | 3.54 | 12.4 | - | 2 | r3det_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 71.83 | oc | 1x | 3.54 | 12.4 | - | 2 | r3det_kld_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 72.12 | oc | 1x | 3.81 | 13.5 | - | 2 | r3det_kld_stable_r50_fpn_1x_dota_oc | model \| log |

| Backbone | mAP | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 (1024,1024,200) | 70.18 | oc | 1x | 3.23 | 15.6 | - | 2 | r3det_tiny_r50_fpn_1x_dota_oc | model \| log |
| ResNet50 (1024,1024,200) | 72.76 | oc | 1x | 3.44 | 14.0 | - | 2 | r3det_tiny_kld_r50_fpn_1x_dota_oc | model \| log |

HRSC

| Backbone | mAP | AP50 | AP75 | Angle | lr schd | Mem (GB) | Inf Time (fps) | Aug | Batch Size | Configs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet50 (800,512) | 52.06 | 84.80 | 58.10 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_r50_fpn_6x_hrsc_rr_le90 | model \| log |
| ResNet50 (800,512) | 54.15 | 86.20 | 60.60 | le90 | 6x | 1.56 | 38.2 | RR | 2 | rotated_retinanet_obb_kld_stable_r50_fpn_6x_hrsc_rr_le90 | model \| log |
| ResNet50 (800,512) | 45.09 | 79.30 | 46.90 | oc | 6x | 1.56 | 39.2 | RR | 2 | rotated_retinanet_hbb_r50_fpn_6x_hrsc_rr_oc | model \| log |
| ResNet50 (800,512) | 58.17 | 87.00 | 69.30 | oc | 6x | 1.56 | 39.5 | RR | 2 | rotated_retinanet_hbb_kld_stable_r50_fpn_6x_hrsc_rr_oc | model \| log |

Citation

@inproceedings{yang2021learning,
	title={Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence},
	author={Yang, Xue and Yang, Xiaojiang and Yang, Jirui and Ming, Qi and Wang, Wentao and Tian, Qi and Yan, Junchi},
	booktitle={Advances in Neural Information Processing Systems},
	year={2021}
}