This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

proposals from RPN per image during training #676

Merged
merged 5 commits into from
Apr 19, 2019
Changes from 4 commits
6 changes: 4 additions & 2 deletions README.md
@@ -129,7 +129,7 @@ you'll also need to change the learning rate, the number of iterations and the l

Here is an example for Mask R-CNN R-50 FPN with the 1x schedule:
```bash
-python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1
+python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1 MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000
```
This follows the [scheduling rules from Detectron](https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14-L30).
Note that we have multiplied the number of iterations by 8x (as well as the learning rate schedules),
@@ -138,6 +138,7 @@ and we have divided the learning rate by 8x.
We also changed the batch size during testing, but that is generally not necessary because testing
requires much less memory than training.
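
The scaling described above can be sketched in a few lines of Python. This is an illustration only: the `Schedule` class and `scale_schedule` helper are hypothetical names, not part of maskrcnn-benchmark, and the reference values (16 images per batch, learning rate 0.02, 90000 iterations, steps at 60000 and 80000) are assumptions obtained by undoing the 8x factor in the command above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Schedule:
    """Hypothetical container for the solver settings discussed above."""
    ims_per_batch: int
    base_lr: float
    max_iter: int
    steps: Tuple[int, ...]


def scale_schedule(reference: Schedule, ims_per_batch: int) -> Schedule:
    """Rescale a reference schedule to a different total batch size.

    A batch that is k times smaller gets a learning rate divided by k,
    and k times more iterations and later step milestones.
    """
    if ims_per_batch <= 0:
        raise ValueError("ims_per_batch must be positive")
    factor = reference.ims_per_batch / ims_per_batch
    return Schedule(
        ims_per_batch=ims_per_batch,
        base_lr=reference.base_lr / factor,
        max_iter=round(reference.max_iter * factor),
        steps=tuple(round(step * factor) for step in reference.steps),
    )


# Assumed reference schedule (the command's values divided or multiplied by 8).
reference = Schedule(ims_per_batch=16, base_lr=0.02, max_iter=90000, steps=(60000, 80000))
single_gpu = scale_schedule(reference, ims_per_batch=2)

# Print the overrides in the same form as the command-line arguments above.
print(
    f"SOLVER.IMS_PER_BATCH {single_gpu.ims_per_batch} "
    f"SOLVER.BASE_LR {single_gpu.base_lr} "
    f"SOLVER.MAX_ITER {single_gpu.max_iter} "
    f'SOLVER.STEPS "{single_gpu.steps}"'
)
```

Under these assumptions, the helper reproduces the values passed in the single-GPU command.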

+Furthermore, we set ```MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000``` because, during training, the proposals are selected per batch rather than per image. The value is calculated as **1000 x images-per-gpu**. Here we have 2 images per GPU, so we set it to 1000 x 2 = 2000. With 8 images per GPU, the value should be 8000. See [#672](https://github.com/facebookresearch/maskrcnn-benchmark/issues/672) for more details.
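
As a quick check of the arithmetic, the rule can be written as a small helper. This is a sketch, not code from the repository: `fpn_post_nms_top_n_train` is a hypothetical function name, and it assumes that `SOLVER.IMS_PER_BATCH` is the total batch size across all GPUs, so that images-per-gpu is that total divided by the number of GPUs.

```python
def fpn_post_nms_top_n_train(ims_per_batch: int, num_gpus: int = 1, per_image: int = 1000) -> int:
    """Return the per-batch proposal budget: per_image x images-per-gpu.

    ims_per_batch is assumed to be the total batch size over all GPUs.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if ims_per_batch <= 0 or ims_per_batch % num_gpus != 0:
        raise ValueError("ims_per_batch must be a positive multiple of num_gpus")
    images_per_gpu = ims_per_batch // num_gpus
    return per_image * images_per_gpu


# Single GPU with 2 images per batch, as in the command above.
single_gpu_value = fpn_post_nms_top_n_train(ims_per_batch=2, num_gpus=1)

# 8 images on each GPU, whatever the number of GPUs.
eight_per_gpu_value = fpn_post_nms_top_n_train(ims_per_batch=16, num_gpus=2)

print(single_gpu_value, eight_per_gpu_value)
```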

### Multi-GPU training
Internally, we use `torch.distributed.launch` to launch
@@ -147,8 +148,9 @@ process will only use a single GPU.

```bash
export NGPUS=8
-python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml"
+python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000
```
+Note that ```MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN``` should be set according to the rule given for single-GPU training: 1000 x images-per-gpu.
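
Since `images_per_gpu x 1000` in the command above is a placeholder rather than a literal argument, one way to fill it in is to build the argument list programmatically. The sketch below is hypothetical: `build_launch_command` is not part of the repository, the paths are the placeholders from the command above, and it assumes `SOLVER.IMS_PER_BATCH` is the total batch size across GPUs.

```python
import shlex
from typing import List


def build_launch_command(
    train_script: str,
    config_file: str,
    num_gpus: int,
    ims_per_batch: int,
    per_image_top_n: int = 1000,
) -> List[str]:
    """Build the multi-GPU launch command with the proposal budget filled in."""
    if num_gpus <= 0 or ims_per_batch <= 0 or ims_per_batch % num_gpus != 0:
        raise ValueError("ims_per_batch must be a positive multiple of num_gpus")
    images_per_gpu = ims_per_batch // num_gpus
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        train_script,
        "--config-file", config_file,
        "SOLVER.IMS_PER_BATCH", str(ims_per_batch),
        # Budget is shared by all images on one GPU: per-image value x images-per-gpu.
        "MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN", str(per_image_top_n * images_per_gpu),
    ]


command = build_launch_command(
    train_script="/path_to_maskrcnn_benchmark/tools/train_net.py",
    config_file="path/to/config/file.yaml",
    num_gpus=8,
    ims_per_batch=16,
)
print(shlex.join(command))
```

Building the command as a list and quoting it with `shlex.join` avoids mistakes when a path or value contains spaces.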

## Abstractions
For more information on some of the main abstractions in our implementation, see [ABSTRACTIONS.md](ABSTRACTIONS.md).
2 changes: 1 addition & 1 deletion maskrcnn_benchmark/modeling/rpn/inference.py
@@ -158,7 +158,7 @@ def select_over_all_levels(self, boxlists):
         # and not per batch
         if self.training:
             objectness = torch.cat(
-                [boxlist.get_field("objectness") for boxlist in boxlists], dim=0
+                [boxlist.get_field("objectness") for boxlist in boxlists], dim=0

Review comment (Contributor): just removing this extra space change and this should be good to merge!

             )
             box_sizes = [len(boxlist) for boxlist in boxlists]
             post_nms_top_n = min(self.fpn_post_nms_top_n, len(objectness))
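
To illustrate why a per-batch budget has to grow with the number of images, here is a minimal sketch of the two selection strategies. It uses plain Python lists instead of `torch` tensors and `BoxList` objects, and the function names `select_per_batch` and `select_per_image` are hypothetical; it is meant to mirror the idea of the training branch above (concatenate the scores of all images, then keep the top N), not to reproduce the repository's code.

```python
from typing import List


def select_per_batch(scores_per_image: List[List[float]], top_n: int) -> List[List[int]]:
    """Keep the top_n highest scores over all images combined.

    Returns, for each image, the sorted indices of the proposals that were kept.
    """
    flat = [
        (score, image_idx, box_idx)
        for image_idx, scores in enumerate(scores_per_image)
        for box_idx, score in enumerate(scores)
    ]
    top_n = min(top_n, len(flat))
    best = sorted(flat, key=lambda item: item[0], reverse=True)[:top_n]
    kept: List[List[int]] = [[] for _ in scores_per_image]
    for _, image_idx, box_idx in best:
        kept[image_idx].append(box_idx)
    return [sorted(indices) for indices in kept]


def select_per_image(scores_per_image: List[List[float]], top_n: int) -> List[List[int]]:
    """Keep the top_n highest scores independently for each image."""
    kept = []
    for scores in scores_per_image:
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        kept.append(sorted(order[: min(top_n, len(scores))]))
    return kept


# Two images: the first has uniformly higher objectness scores than the second.
scores = [[0.9, 0.8, 0.7], [0.3, 0.2, 0.1]]

per_batch = select_per_batch(scores, top_n=2)
per_image = select_per_image(scores, top_n=2)
per_batch_scaled = select_per_batch(scores, top_n=2 * len(scores))

print(per_batch, per_image, per_batch_scaled)
```

With a shared budget, one image with high scores can use up all the slots, which is why the README scales the value by the number of images per GPU. Scaling restores the average budget per image, but it still does not guarantee an even split.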