readme_rtdetr.txt
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
----------------------- Converting Paddle syntax to PyTorch -----------------------
class SPP(nn.Layer):
    def __init__(...):
        super(SPP, self).__init__()
        self.pool = []
        self.data_format = data_format
        for i, size in enumerate(pool_size):
            pool = self.add_sublayer(
                'pool{}'.format(i),
                nn.MaxPool2D(
                    kernel_size=size,
                    stride=1,
                    padding=size // 2,
                    data_format=data_format,
                    ceil_mode=False))
            self.pool.append(pool)
        self.conv = ConvBNLayer(ch_in, ch_out, k, padding=k // 2, act=act)
is converted to (see custom_pan.py):
class SPP(nn.Module):
    def __init__(...):
        super(SPP, self).__init__()
        self.pool = []
        self.data_format = data_format
        for i, size in enumerate(pool_size):
            name = 'pool{}'.format(i)
            pool = nn.MaxPool2d(
                kernel_size=size,
                stride=1,
                padding=size // 2,
                ceil_mode=False)
            self.add_module(name, pool)
            self.pool.append(pool)
        self.conv = ConvBNLayer(ch_in, ch_out, k, padding=k // 2, act=act, act_name=act_name)
In other words, the single Paddle statement
    pool = self.add_sublayer('pool{}'.format(i), nn.MaxPool2D(...))
is split into these three PyTorch statements:
    name = 'pool{}'.format(i)
    pool = nn.MaxPool2d(...)
    self.add_module(name, pool)
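A minimal runnable sketch of the three-statement pattern follows. The class name TinySPP and the default pool sizes are illustrative assumptions, not taken from custom_pan.py, and the trailing ConvBNLayer is omitted. It only shows that add_module registers each pooling layer under its name while the plain Python list keeps it reachable in forward.

```python
import torch
import torch.nn as nn


class TinySPP(nn.Module):
    """Illustrative stand-in for SPP; name and defaults are assumptions."""

    def __init__(self, pool_size=(5, 9, 13)):
        super().__init__()
        self.pool = []  # plain Python list, as in the converted code
        for i, size in enumerate(pool_size):
            name = 'pool{}'.format(i)
            pool = nn.MaxPool2d(kernel_size=size, stride=1,
                                padding=size // 2, ceil_mode=False)
            self.add_module(name, pool)  # register the layer as a submodule
            self.pool.append(pool)

    def forward(self, x):
        # Concatenate the input with each pooled copy along the channel axis.
        return torch.cat([x] + [pool(x) for pool in self.pool], dim=1)


spp = TinySPP()
registered = [name for name, _ in spp.named_children()]
out = spp(torch.randn(1, 4, 20, 20))
print(registered, tuple(out.shape))
```

Without the add_module call, the layers held only in the plain list would not show up in named_children().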
----------------------- Converting weights -----------------------
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_r18vd_dec3_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_r34vd_dec4_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_r50vd_m_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_r50vd_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_r101vd_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_hgnetv2_l_6x_coco.pdparams
wget https://bj.bcebos.com/v1/paddledet/models/rtdetr_hgnetv2_x_6x_coco.pdparams
wget https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
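When reproducing training, carefully check the lr and L2Decay of every parameter.
Forward pass:
The backbone outputs 3 tensors with shapes [N, 512, 80, 80], [N, 1024, 40, 40], [N, 2048, 20, 20].
They enter HybridEncoder.
First, 3 separate conv (1x1 convolution) + bn layers reduce the channel dimension (the corresponding layers are named input_proj). The shapes become [N, 256, 80, 80], [N, 256, 40, 40], [N, 256, 20, 20], where 256 is hidden_dim.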
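The channel reduction described here can be sketched as below. This is a minimal illustration, not the repo's HybridEncoder code: the variable layout and bias=False are assumptions, and only the shapes, the 1x1 conv + bn structure, the input_proj name and hidden_dim = 256 come from the notes.

```python
import torch
import torch.nn as nn

hidden_dim = 256
in_channels = [512, 1024, 2048]

# One 1x1 conv + BatchNorm per backbone level (bias=False is an assumption).
input_proj = nn.ModuleList([
    nn.Sequential(
        nn.Conv2d(c, hidden_dim, kernel_size=1, bias=False),
        nn.BatchNorm2d(hidden_dim),
    )
    for c in in_channels
])

N = 2
feats = [
    torch.randn(N, 512, 80, 80),
    torch.randn(N, 1024, 40, 40),
    torch.randn(N, 2048, 20, 20),
]
# Each level keeps its spatial size; only the channel count changes.
proj_feats = [proj(f) for proj, f in zip(input_proj, feats)]
shapes = [tuple(p.shape) for p in proj_feats]
print(shapes)
```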
mmdet/models/losses/detr_loss.py
_get_loss_bbox()
boxes may be modified by an in-place operation here, which would prevent training.
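For reference, the following generic sketch shows how an in-place edit can break backpropagation in PyTorch. It is not the code of _get_loss_bbox(), and whether this is the actual cause there is not confirmed. The function names are hypothetical.

```python
import torch


def scale_inplace(x):
    y = x.sigmoid()   # autograd saves y to compute sigmoid's gradient
    y.mul_(2.0)       # in-place edit invalidates that saved value
    return y.sum()


def scale_out_of_place(x):
    y = x.sigmoid()
    y = y * 2.0       # new tensor; the saved sigmoid output is untouched
    return y.sum()


x = torch.tensor([0.0, 1.0], requires_grad=True)

try:
    scale_inplace(x).backward()
    inplace_failed = False
except RuntimeError:
    inplace_failed = True

scale_out_of_place(x).backward()
print(inplace_failed, x.grad)
```

If boxes is reused after being edited, working on boxes.clone() or using out-of-place operations is the usual way to rule this out.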
Some differences from the original:
# attn_mask enters MultiHeadAttention and is added to QK^T/sqrt(dk).
# Where attn_mask is True it becomes 0; where it is False it becomes a large negative number (-100000, standing in for negative infinity).
device = attn_mask.device
attn_mask = torch.where(
    attn_mask.to(torch.bool),
    torch.zeros(attn_mask.shape, dtype=tgt.dtype, device=device),
    torch.ones(attn_mask.shape, dtype=tgt.dtype, device=device) * -100000.)
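The effect of this additive mask can be checked in isolation. In the sketch below, the helper name to_additive_mask and the all-zero scores are illustrative assumptions. It shows that a position masked with False ends up with a softmax weight of effectively 0.

```python
import torch


def to_additive_mask(attn_mask, dtype=torch.float32, neg_value=-100000.0):
    """True -> 0 (may attend), False -> large negative number (blocked)."""
    attn_mask = attn_mask.to(torch.bool)
    zeros = torch.zeros(attn_mask.shape, dtype=dtype, device=attn_mask.device)
    return torch.where(attn_mask, zeros, torch.full_like(zeros, neg_value))


scores = torch.zeros(1, 3)                   # stand-in for QK^T/sqrt(dk)
mask = torch.tensor([[True, True, False]])   # last key is blocked
weights = torch.softmax(scores + to_additive_mask(mask), dim=-1)
print(weights)
```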
# Unlike the Paddle version, invalid anchors are temporarily set to a large finite number instead of inf.
# anchors = torch.where(valid_mask, anchors, paddle.to_tensor(float("inf")))
anchors = torch.where(valid_mask, anchors, torch.ones_like(anchors) * 100000.)
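A small sketch of this substitution on toy data follows. The toy anchors, the eps threshold and the way valid_mask is built are assumptions made for illustration; only the final torch.where line mirrors the notes.

```python
import torch

# Toy normalized anchors; the middle row touches the border and is invalid.
anchors = torch.tensor([[0.1, 0.5], [0.0, 0.5], [0.9, 0.5]])
eps = 1e-2  # assumed validity threshold
valid_mask = ((anchors > eps) & (anchors < 1 - eps)).all(dim=-1, keepdim=True)

# Inverse sigmoid; the invalid row produces -inf before masking.
logits = torch.log(anchors / (1 - anchors))

# Same substitution as in the notes: a large finite value instead of inf.
masked = torch.where(valid_mask, logits, torch.ones_like(logits) * 100000.)
all_finite = bool(torch.isfinite(masked).all())
print(masked, all_finite)
```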
encoder_layer (a TransformerLayer) is placed into HybridEncoder's self.encoder (an nn.ModuleList).
self.encoder holds 1 element, of type TransformerEncoder(encoder_layer, num_encoder_layers).
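The nesting can be sketched with PyTorch's built-in layers standing in for the repo's own TransformerLayer and TransformerEncoder classes. The values nhead=8, dim_feedforward=1024, num_encoder_layers=1 and the choice of the 20x20 feature map as input are assumptions for illustration.

```python
import torch
import torch.nn as nn

hidden_dim, num_encoder_layers = 256, 1

# Built-in torch layers stand in for the repo's TransformerLayer/TransformerEncoder.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=hidden_dim, nhead=8, dim_feedforward=1024,
    dropout=0.0, batch_first=True)
encoder = nn.ModuleList(
    [nn.TransformerEncoder(encoder_layer, num_encoder_layers)])

feat = torch.randn(2, hidden_dim, 20, 20)
n, c, h, w = feat.shape
tokens = feat.flatten(2).permute(0, 2, 1)           # [N, H*W, C]
memory = encoder[0](tokens)                         # self-attention over tokens
out = memory.permute(0, 2, 1).reshape(n, c, h, w)   # back to [N, C, H, W]
print(len(encoder), tuple(out.shape))
```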
python tools/convert_weights.py -f exps/rtdetr/rtdetr_r18vd_6x_coco.py -c rtdetr_r18vd_dec3_6x_coco.pdparams -oc rtdetr_r18vd_dec3_6x_coco.pth -nc 80 --device gpu
python tools/convert_weights.py -f exps/rtdetr/rtdetr_r50vd_6x_coco.py -c rtdetr_r50vd_6x_coco.pdparams -oc rtdetr_r50vd_6x_coco.pth -nc 80 --device gpu
python tools/convert_weights.py -f exps/rtdetr/rtdetr_r50vd_6x_coco.py -c ResNet50_vd_ssld_v2_pretrained.pdparams -oc ResNet50_vd_ssld_v2_pretrained.pth -nc 80 --only_backbone True
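As background for the commands above, a Paddle-to-PyTorch conversion typically has to handle two things: renaming the BatchNorm statistics (Paddle's _mean and _variance versus PyTorch's running_mean and running_var) and transposing Linear weights, which Paddle stores as [in, out] and PyTorch as [out, in]. The sketch below is not the implementation of tools/convert_weights.py. The function name, the linear_weight_names argument and the toy keys are hypothetical.

```python
import numpy as np
import torch


def convert_state_dict(paddle_state, linear_weight_names=()):
    """Map a dict of numpy arrays with Paddle-style names to torch tensors.

    linear_weight_names is a hypothetical argument listing which keys
    hold nn.Linear weights and therefore need a transpose.
    """
    torch_state = {}
    for name, value in paddle_state.items():
        tensor = torch.from_numpy(np.asarray(value))
        if name in linear_weight_names:
            # Paddle Linear weight is [in, out]; PyTorch expects [out, in].
            tensor = tensor.t().contiguous()
        # Paddle BatchNorm statistics use different names than PyTorch.
        if name.endswith('._mean'):
            name = name[:-len('._mean')] + '.running_mean'
        elif name.endswith('._variance'):
            name = name[:-len('._variance')] + '.running_var'
        torch_state[name] = tensor
    return torch_state


toy = {
    'bn._mean': np.zeros(4, dtype=np.float32),
    'bn._variance': np.ones(4, dtype=np.float32),
    'fc.weight': np.zeros((3, 5), dtype=np.float32),
}
converted = convert_state_dict(toy, linear_weight_names={'fc.weight'})
print({k: tuple(v.shape) for k, v in converted.items()})
```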
----------------------- Prediction -----------------------
python tools/demo.py image -f exps/rtdetr/rtdetr_r18vd_6x_coco.py -c rtdetr_r18vd_dec3_6x_coco.pth --path assets/000000000019.jpg --conf 0.15 --tsize 640 --save_result --device gpu
python tools/demo.py image -f exps/rtdetr/rtdetr_r50vd_6x_coco.py -c rtdetr_r50vd_6x_coco.pth --path assets/000000000019.jpg --conf 0.15 --tsize 640 --save_result --device gpu
----------------------- Training -----------------------
----------------------- Transfer learning: pass -c (--ckpt) to load a pretrained model. -----------------------
Run in the background:
nohup xxx > ppyolo.log 2>&1 &
- - - - - - - - - - - - - - - - - - - - - -
export CUDA_VISIBLE_DEVICES=0,1
nohup python tools/train.py -f exps/rtdetr/rtdetr_r18vd_6x_voc2012.py -d 2 -b 24 -eb 24 -w 4 -ew 4 -lrs 0.1 -c rtdetr_r18vd_dec3_6x_coco.pth > rtdetr_r18vd_6x_voc2012.log 2>&1 &
python tools/train.py -f exps/rtdetr/rtdetr_r18vd_6x_voc2012.py -d 2 -b 24 -eb 24 -w 4 -ew 4 -lrs 1.0 -c rtdetr_r18vd_dec3_6x_coco.pth
python tools/train.py -f exps/rtdetr/rtdetr_r18vd_6x_voc2012.py -d 1 -b 24 -eb 24 -w 4 -ew 4 -lrs 0.1 -c rtdetr_r18vd_dec3_6x_coco.pth
python tools/train.py -f exps/rtdetr/rtdetr_r18vd_6x_voc2012.py -d 1 -b 2 -eb 2 -w 0 -ew 0 -c rtdetr_r18vd_dec3_6x_coco.pth
----------------------- Resuming training (add the --resume argument) -----------------------
----------------------- Evaluation -----------------------
python tools/eval.py -f exps/rtdetr/rtdetr_r50vd_6x_coco.py -d 1 -b 8 -w 4 -c rtdetr_r50vd_6x_coco.pth --tsize 640
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.532
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.714
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.578
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.348
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.580
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.701
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.391
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.656
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.722
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.547
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.765
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.880
----------------------- Exporting to ONNX -----------------------
----------------------- Exporting to TensorRT -----------------------
----------------------- Reproducing the COCO accuracy -----------------------