add SQR for Deformable DETR #8579

Merged: 9 commits, Sep 24, 2023
50 changes: 50 additions & 0 deletions configs/sqr/_base_/deformable_detr_sqr_r50.yml
@@ -0,0 +1,50 @@
architecture: DETR
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vb_normal_pretrained.pdparams
hidden_dim: 256
use_focal_loss: True


DETR:
backbone: ResNet
transformer: QRDeformableTransformer
detr_head: DeformableDETRHead
post_process: DETRPostProcess


ResNet:
# index 0 stands for res2
depth: 50
norm_type: bn
freeze_at: 0
return_idx: [1, 2, 3]
lr_mult_list: [0.0, 0.1, 0.1, 0.1]
num_stages: 4


QRDeformableTransformer:
num_queries: 300
position_embed_type: sine
nhead: 8
num_encoder_layers: 6
num_decoder_layers: 6
dim_feedforward: 1024
dropout: 0.1
activation: relu
num_feature_levels: 4
num_encoder_points: 4
num_decoder_points: 4
start_q: [0, 0, 1, 2, 4, 7, 12]
end_q: [1, 2, 4, 7, 12, 20, 33]


DeformableDETRHead:
num_mlp_layers: 3


DETRLoss:
loss_coeff: {class: 2, bbox: 5, giou: 2}
aux_loss: True


HungarianMatcher:
matcher_coeff: {class: 2, bbox: 5, giou: 2}
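The start_q/end_q lists above encode the SQR (Selective Query Recollection) schedule: at decoder layer lid, the queries collected at positions start_q[lid]..end_q[lid]-1 of the reserve list are re-processed. A quick sketch (plain Python, values copied from this config) of how many query sets each layer handles in parallel:

```python
# SQR schedule as configured above
start_q = [0, 0, 1, 2, 4, 7, 12]
end_q = [1, 2, 4, 7, 12, 20, 33]

# number of query sets each decoder layer processes in parallel
sets_per_layer = [e - s for s, e in zip(start_q, end_q)]
print(sets_per_layer)  # [1, 2, 3, 5, 8, 13, 21]
```

The counts grow Fibonacci-style; with num_decoder_layers set to 6, only the first six entries are consumed.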
44 changes: 44 additions & 0 deletions configs/sqr/_base_/deformable_detr_sqr_reader.yml
@@ -0,0 +1,44 @@
worker_num: 4
TrainReader:
sample_transforms:
- Decode: {}
- RandomFlip: {prob: 0.5}
- RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
transforms2: [
RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
RandomSizeCrop: { min_size: 384, max_size: 600 },
RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- NormalizeBox: {}
- BboxXYXY2XYWH: {}
- Permute: {}
batch_transforms:
- PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
batch_size: 4
shuffle: true
drop_last: true
collate_batch: false
use_shared_memory: false


EvalReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false


TestReader:
sample_transforms:
- Decode: {}
- Resize: {target_size: [800, 1333], keep_ratio: True}
- NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
- Permute: {}
batch_size: 1
shuffle: false
drop_last: false
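RandomSelect applies exactly one of its two transform branches per sample, choosing branch 1 with the given probability. A minimal sketch of that semantics (hypothetical helper, not the ppdet implementation):

```python
import random

def random_select(sample, transforms1, transforms2, prob=0.5, rng=random):
    """Apply one of two transform chains, picking transforms1 with probability `prob`."""
    branch = transforms1 if rng.random() < prob else transforms2
    for t in branch:
        sample = t(sample)
    return sample

# toy transforms standing in for the resize/crop chains above
rng = random.Random(0)  # Random(0).random() ~= 0.844, so branch 2 is taken here
result = random_select(0, [lambda s: s + 1], [lambda s: s + 2], rng=rng)
print(result)  # 2
```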
16 changes: 16 additions & 0 deletions configs/sqr/_base_/deformable_sqr_optimizer_1x.yml
@@ -0,0 +1,16 @@
epoch: 50

LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [40]
use_warmup: false

OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001
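PiecewiseDecay here holds base_lr for the first 40 epochs, then multiplies it by gamma. A sketch of the schedule (hypothetical helper mirroring the config values):

```python
def piecewise_lr(epoch, base_lr=0.0002, gamma=0.1, milestones=(40,)):
    # decay by gamma once per milestone already reached
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

assert piecewise_lr(0) == 0.0002
assert piecewise_lr(39) == 0.0002
assert abs(piecewise_lr(40) - 0.00002) < 1e-12
```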
27 changes: 27 additions & 0 deletions configs/sqr/deformable_detr_sqr_r50_12e_coco.yml
@@ -0,0 +1,27 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/deformable_detr_sqr_r50.yml',
'_base_/deformable_detr_sqr_reader.yml',
]
weights: output/deformable_detr_sqr_r50_12e_coco/model_final
find_unused_parameters: True


# a standard 1x schedule
epoch: 12

LearningRate:
base_lr: 0.0002
schedulers:
- !PiecewiseDecay
gamma: 0.1
milestones: [8, 11]
use_warmup: false

OptimizerBuilder:
clip_grad_by_norm: 0.1
regularizer: false
optimizer:
type: AdamW
weight_decay: 0.0001
9 changes: 9 additions & 0 deletions configs/sqr/deformable_detr_sqr_r50_1x_coco.yml
@@ -0,0 +1,9 @@
_BASE_: [
'../datasets/coco_detection.yml',
'../runtime.yml',
'_base_/deformable_sqr_optimizer_1x.yml',
'_base_/deformable_detr_sqr_r50.yml',
'_base_/deformable_detr_sqr_reader.yml',
]
weights: output/deformable_detr_sqr_r50_1x_coco/model_final
find_unused_parameters: True
8 changes: 3 additions & 5 deletions ppdet/data/transform/operators.py
@@ -956,13 +956,11 @@ def apply(self, sample, context=None):

resize_h = int(im_scale * float(im_shape[0]) + 0.5)
resize_w = int(im_scale * float(im_shape[1]) + 0.5)

im_scale_x = im_scale
im_scale_y = im_scale
else:
resize_h, resize_w = self.target_size
im_scale_y = resize_h / im_shape[0]
im_scale_x = resize_w / im_shape[1]

im_scale_y = resize_h / im_shape[0]
im_scale_x = resize_w / im_shape[1]

if len(im.shape) == 3:
im = self.apply_image(sample['image'], [im_scale_x, im_scale_y])
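The hunk above moves the im_scale_y/im_scale_x computation out of the else branch, so both the keep-ratio and fixed-size paths derive the scales from the final output size. A standalone sketch of the resulting control flow (hypothetical function, heavily simplified from the Resize operator):

```python
def resize_plan(im_shape, target_size=None, im_scale=None):
    """Return (resize_h, resize_w, im_scale_y, im_scale_x) for an image of shape (h, w)."""
    h, w = im_shape
    if im_scale is not None:
        # keep-ratio path: scale both sides, rounding to the nearest pixel
        resize_h = int(im_scale * float(h) + 0.5)
        resize_w = int(im_scale * float(w) + 0.5)
    else:
        # fixed-size path
        resize_h, resize_w = target_size
    # unified tail: per-axis scales always derived from the final size
    return resize_h, resize_w, resize_h / h, resize_w / w

print(resize_plan((480, 640), im_scale=0.5))               # (240, 320, 0.5, 0.5)
print(resize_plan((500, 1000), target_size=(1000, 2000)))  # (1000, 2000, 2.0, 2.0)
```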
2 changes: 1 addition & 1 deletion ppdet/metrics/metrics.py
@@ -124,7 +124,7 @@ def update(self, inputs, outputs):

def accumulate(self):
if len(self.results['bbox']) > 0:
output = "bbox.json"
output = f"bbox_{paddle.distributed.get_rank()}.json"
Collaborator:

What is the reason for this change?

Contributor Author:

To avoid file corruption from multiple processes writing the same file during multi-GPU eval.

Collaborator:

The current multi-GPU eval logic in ppdet is incorrect; there is no logic to merge the final results. Please keep the original logic here.

if self.output_eval:
output = os.path.join(self.output_eval, output)
with open(output, 'w') as f:
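For context, the per-rank filename avoids concurrent writes to a single bbox.json, but as the reviewer points out the shards would then need a merge step. A minimal sketch of both halves (hypothetical helpers, not the ppdet API):

```python
import json
import os
import tempfile

def write_shard(results, rank, out_dir):
    # each rank writes its own shard instead of clobbering a shared bbox.json
    path = os.path.join(out_dir, f"bbox_{rank}.json")
    with open(path, "w") as f:
        json.dump(results, f)
    return path

def merge_shards(paths):
    # concatenate all per-rank shards into one result list
    merged = []
    for path in sorted(paths):
        with open(path) as f:
            merged.extend(json.load(f))
    return merged

with tempfile.TemporaryDirectory() as tmp:
    paths = [write_shard([{"rank": r}], r, tmp) for r in range(2)]
    merged = merge_shards(paths)
print(merged)  # [{'rank': 0}, {'rank': 1}]
```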
109 changes: 109 additions & 0 deletions ppdet/modeling/transformers/deformable_transformer.py
@@ -535,3 +535,112 @@ def forward(self, src_feats, src_mask=None, *args, **kwargs):
level_start_index, mask_flatten, query_embed)

return (hs, memory, reference_points)


class QRDeformableTransformerDecoder(DeformableTransformerDecoder):
def __init__(self, decoder_layer, num_layers,
start_q=None, end_q=None, return_intermediate=False):
super(QRDeformableTransformerDecoder, self).__init__(
decoder_layer, num_layers, return_intermediate=return_intermediate)
self.start_q = start_q
self.end_q = end_q

def forward(self,
tgt,
reference_points,
memory,
memory_spatial_shapes,
memory_level_start_index,
memory_mask=None,
query_pos_embed=None):

if not self.training:
return super(QRDeformableTransformerDecoder, self).forward(
tgt, reference_points,
memory, memory_spatial_shapes,
memory_level_start_index,
memory_mask=memory_mask,
query_pos_embed=query_pos_embed)

batchsize = tgt.shape[0]
query_list_reserve = [tgt]
intermediate = []
for lid, layer in enumerate(self.layers):

start_q = self.start_q[lid]
end_q = self.end_q[lid]
query_list = query_list_reserve.copy()[start_q:end_q]

# prepare for parallel process
output = paddle.concat(query_list, axis=0)
fakesetsize = int(output.shape[0] / batchsize)
reference_points_tiled = reference_points.tile([fakesetsize, 1, 1, 1])

memory_tiled = memory.tile([fakesetsize, 1, 1])
query_pos_embed_tiled = query_pos_embed.tile([fakesetsize, 1, 1])
memory_mask_tiled = memory_mask.tile([fakesetsize, 1])

output = layer(output, reference_points_tiled, memory_tiled,
memory_spatial_shapes, memory_level_start_index,
memory_mask_tiled, query_pos_embed_tiled)

for i in range(fakesetsize):
query_list_reserve.append(output[batchsize*i:batchsize*(i+1)])

if self.return_intermediate:
for i in range(fakesetsize):
intermediate.append(output[batchsize*i:batchsize*(i+1)])

if self.return_intermediate:
return paddle.stack(intermediate)

return output.unsqueeze(0)


@register
class QRDeformableTransformer(DeformableTransformer):

def __init__(self,
num_queries=300,
position_embed_type='sine',
return_intermediate_dec=True,
in_feats_channel=[512, 1024, 2048],
num_feature_levels=4,
num_encoder_points=4,
num_decoder_points=4,
hidden_dim=256,
nhead=8,
num_encoder_layers=6,
num_decoder_layers=6,
dim_feedforward=1024,
dropout=0.1,
activation="relu",
lr_mult=0.1,
pe_temperature=10000,
pe_offset=-0.5,
start_q=None,
end_q=None):
super(QRDeformableTransformer, self).__init__(
num_queries=num_queries,
position_embed_type=position_embed_type,
return_intermediate_dec=return_intermediate_dec,
in_feats_channel=in_feats_channel,
num_feature_levels=num_feature_levels,
num_encoder_points=num_encoder_points,
num_decoder_points=num_decoder_points,
hidden_dim=hidden_dim,
nhead=nhead,
num_encoder_layers=num_encoder_layers,
num_decoder_layers=num_decoder_layers,
dim_feedforward=dim_feedforward,
dropout=dropout,
activation=activation,
lr_mult=lr_mult,
pe_temperature=pe_temperature,
pe_offset=pe_offset)

decoder_layer = DeformableTransformerDecoderLayer(
hidden_dim, nhead, dim_feedforward, dropout, activation,
num_feature_levels, num_decoder_points)
self.decoder = QRDeformableTransformerDecoder(
decoder_layer, num_decoder_layers, start_q, end_q, return_intermediate_dec)
3 changes: 2 additions & 1 deletion ppdet/modeling/transformers/matchers.py
@@ -122,9 +122,10 @@ def forward(self,
out_bbox.unsqueeze(1) - tgt_bbox.unsqueeze(0)).abs().sum(-1)

# Compute the giou cost betwen boxes
cost_giou = self.giou_loss(
giou_loss = self.giou_loss(
Collaborator:

What is the reason for this change? Is it generally applicable to the other models?

Contributor Author:

The original cost_giou here was computed incorrectly, but this has no effect on the subsequent linear_sum_assignment computation, so the other models are unaffected.

Collaborator:

I suggest adding a flag, so the other models would not need to be re-validated.

Contributor Author:

> I suggest adding a flag, so the other models would not need to be re-validated.

I do not think that is necessary: cost_giou is only used to compute C, C is only used to compute linear_sum_assignment, and shifting cost_giou as a whole by a constant has no effect on linear_sum_assignment.

bbox_cxcywh_to_xyxy(out_bbox.unsqueeze(1)),
bbox_cxcywh_to_xyxy(tgt_bbox.unsqueeze(0))).squeeze(-1)
cost_giou = giou_loss - 1

# Final cost matrix
C = self.matcher_coeff['class'] * cost_class + \
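The author's invariance argument can be checked numerically: shifting every entry of the cost matrix by a constant shifts each complete assignment's total by the same amount, so scipy's linear_sum_assignment (which the matcher ultimately relies on) returns the same matching. A quick sketch:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
C = rng.random((6, 6))  # stand-in for the matcher's cost matrix

rows_a, cols_a = linear_sum_assignment(C)
# subtract a constant from every entry, as in `cost_giou = giou_loss - 1`
rows_b, cols_b = linear_sum_assignment(C - 1.0)

assert (cols_a == cols_b).all()  # the optimal matching is unchanged
```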