[Refactor]Refactor exporting One-Stage model to ONNX #6003

Merged
merged 12 commits into from
Oct 13, 2021

jshilong
Collaborator

@jshilong jshilong commented Sep 1, 2021

Motivation

Recent ONNX-related development has moved too quickly, which has made the code hard to read. In addition, the return type of the same function, such as _get_bboxes in the DenseHead, is not consistent.

Modification

This PR moves all ONNX-related code for the one-stage models into a new onnx_export method in the corresponding class.

  • FCOS
  • FSAF
  • RetinaNet
  • SSD
  • YOLOv3
  • CornerNet
  • DETR
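As a rough illustration of the pattern described above, the sketch below keeps the regular post-processing path and the export path in separate methods, so each one has a single return type. Only the method names get_bboxes and onnx_export come from this PR; the class ToyDenseHead, its list-based inputs, and both method bodies are hypothetical stand-ins, not mmdet code.

```python
from typing import List, Tuple

Box = List[float]


class ToyDenseHead:
    """Hypothetical stand-in for a dense head; not the mmdet implementation."""

    def __init__(self, score_thr: float = 0.5) -> None:
        self.score_thr = score_thr

    def get_bboxes(
        self, boxes: List[Box], scores: List[float]
    ) -> List[Tuple[Box, float]]:
        """Regular inference path: always returns filtered (box, score) pairs."""
        return [(b, s) for b, s in zip(boxes, scores) if s >= self.score_thr]

    def onnx_export(
        self, boxes: List[Box], scores: List[float]
    ) -> Tuple[List[Box], List[float]]:
        """Export path: always returns the unfiltered boxes and scores.

        Skipping the threshold here is an assumption made for this sketch;
        the point is only that export logic no longer lives in get_bboxes.
        """
        return boxes, scores


head = ToyDenseHead(score_thr=0.5)
boxes = [[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 20.0, 20.0]]
scores = [0.9, 0.2]

detections = head.get_bboxes(boxes, scores)
export_boxes, export_scores = head.onnx_export(boxes, scores)
```

With the two paths split this way, a caller never has to inspect the return value to find out which mode produced it.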

BC-breaking (Optional)

None

* revert batch to single

* update anchor_head

* replace preds with bboxes

* add point_bbox_coder

* FCOS add get_selected_priori

* unified anchor-free and anchor-based get_bbox_single

* update code

* update reppoints and sabl

* add sparse priors

* add mlvlpointsgenerator

* revert __init__ of core

* refactor reppoints

* delete label channal

* add docstr

* fix typo

* fix args

* fix typo

* fix doc

* fix stride_h

* add offset

* Unified bbox coder

* add offset

* remove point_bbox_coder.py

* fix docstr

* new interface of single_proir

* fix device

* add unitest

* add cuda unitest

* add more cuda unintest

* fix reppoints

* fix device

* update all prior

* update vfnet

* add unintest for ssd and yolo and rename prior_idxs

* add docstr for MlvlPointGenerator

* update reppoints and rpnhead

* add space

* add num_base_priors

* update some model

* update docstr

* fixAugFPN test and lint.

* Fix autoassign

* add docs

* Unified fcos decoding

* update docstr

* fix train error

* Fix Vfnet

* Fix some

* update centernet

* revert

* add warnings

* fix unittest error

* delete duplicated

* fix comment

* fix docs

* fix type

Co-authored-by: zhangshilong <2392587229zsl@gmail.com>
@jshilong jshilong changed the base branch from master to refactor_dense September 1, 2021 08:50
@jshilong jshilong changed the title Onestage onnx [Refactor]Refactor exporting One-Stage model to ONNX Sep 1, 2021
@jshilong jshilong added the WIP Working in progress label Sep 1, 2021
@ZwwWayne
Collaborator

ZwwWayne commented Sep 4, 2021

some remaining issues, like num_base_anchors should also be fixed with comments

@jshilong
Collaborator Author

jshilong commented Sep 4, 2021

some remaining issues, like num_base_anchors should also be fixed with comments

This PR is still a work in progress; any suggestions are appreciated.

@RunningLeon RunningLeon self-requested a review September 6, 2021 03:15
@jshilong
Collaborator Author

jshilong commented Sep 9, 2021

some remaining issues, like num_base_anchors should also be fixed with comments

Renaming the attribute num_anchors to num_base_priors will affect training. I suggest doing it in the future, once all models have changed to prior_generator.

@ZwwWayne
Collaborator

some remaining issues, like num_base_anchors should also be fixed with comments

Renaming the attribute num_anchors to num_base_priors will affect training. I suggest doing it in the future, once all models have changed to prior_generator.

Sure, we can do that in the next PR.
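One common way to stage a rename like this is to store the value under the new name and keep the old name as a deprecated read-only alias. The sketch below is only an illustration of that idea: the attribute names num_anchors and num_base_priors come from this thread, while the class ToyAnchorHead and the warning text are hypothetical, and the follow-up PR may well have handled it differently.

```python
import warnings


class ToyAnchorHead:
    """Hypothetical head used only to illustrate a staged attribute rename."""

    def __init__(self, num_base_priors: int) -> None:
        # The value is stored under the new name only.
        self.num_base_priors = num_base_priors

    @property
    def num_anchors(self) -> int:
        # Old name kept as a read-only alias so existing callers keep working.
        warnings.warn(
            "`num_anchors` is deprecated, use `num_base_priors` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.num_base_priors


head = ToyAnchorHead(num_base_priors=9)

# Record warnings so the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_value = head.num_anchors
```

Reading the old attribute still returns the same number, and the recorded warning tells callers which name to migrate to.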

@RangiLyu RangiLyu mentioned this pull request Sep 13, 2021
@jshilong
Collaborator Author

@jshilong It seems that somewhere FCOS uses torch.arange with a non-int32 input, which makes onnx2tensorrt fail.

configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py
checkpoints/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco_batch.onnx
--trt-file
checkpoints/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco_batch.trt
--input-img
data/blueangels.jpg
--show
--verbose
--verify
--workspace-size 1
--max-shape 1344
--shape 400 600

[TensorRT] VERBOSE: ModelImporter.cpp:125: Range_675 [Range] inputs: [2512 -> ()], [1180 -> ()], [2513 -> ()],
Traceback (most recent call last):
  File "/home/PJLAB/maningsheng/projects/openmmlab/mmdetection/tools/deployment/onnx2tensorrt.py", line 254, in <module>
    verbose=args.verbose)
  File "/home/PJLAB/maningsheng/projects/openmmlab/mmdetection/tools/deployment/onnx2tensorrt.py", line 45, in onnx2tensorrt
    max_workspace_size=max_workspace_size)
  File "/home/PJLAB/maningsheng/projects/openmmlab/mmcv-pt1.8/mmcv/tensorrt/tensorrt_utils.py", line 63, in onnx2trt
    raise RuntimeError(f'parse onnx failed:\n{error_msgs}')
RuntimeError: parse onnx failed:
In node -1 (importRange): UNSUPPORTED_NODE: Assertion failed: inputs.at(0).isInt32() && "For range operator with dynamic inputs, this version of TensorRT only supports INT32!"

I will check it

@jshilong
Collaborator Author

Would you mind helping to retest it? I may have fixed it in point_generator

@VVsssssk
Contributor

VVsssssk commented Oct 12, 2021

Hello. When I tested this PR, the conversion to ONNX succeeded for all models, but the onnx2trt conversion succeeded only for FCOS.

ERR LOG:
(pt1.8) PJLAB\shenkun@shai14001070l:~/workspace/mmdetection$ python tools/deployment/onnx2tensorrt.py configs/fsaf/fsaf_r50_fpn_1x_coco.py tmp/fsaf.onnx --trt-file='fsaf.trt' --input-img='tests/data/color.jpg' --shape 800 1216
tools/deployment/onnx2tensorrt.py:199: UserWarning: Arguments like --to-rgb, --mean, --std, --dataset would be parsed directly from config file and are deprecated and will be removed in future releases.
  warnings.warn(
Traceback (most recent call last):
  File "tools/deployment/onnx2tensorrt.py", line 247, in <module>
    onnx2tensorrt(
  File "tools/deployment/onnx2tensorrt.py", line 40, in onnx2tensorrt
    trt_engine = onnx2trt(
  File "/home/PJLAB/shenkun/workspace/mmcv/mmcv/tensorrt/tensorrt_utils.py", line 63, in onnx2trt
    raise RuntimeError(f'parse onnx failed:\n{error_msgs}')
RuntimeError: parse onnx failed:
In node -1 (convertAxis): UNSUPPORTED_NODE: Assertion failed: axis >= 0 && axis < nbDims

(pt1.8) PJLAB\shenkun@shai14001070l:~/workspace/mmdetection$ python tools/deployment/onnx2tensorrt.py configs/retinanet/retinanet_r50_fpn_1x_coco.py tmp/retinanet.onnx --trt-file='retinanet.trt' --input-img='tests/data/color.jpg' --shape 800 1216
tools/deployment/onnx2tensorrt.py:199: UserWarning: Arguments like --to-rgb, --mean, --std, --dataset would be parsed directly from config file and are deprecated and will be removed in future releases.
  warnings.warn(
Traceback (most recent call last):
  File "tools/deployment/onnx2tensorrt.py", line 247, in <module>
    onnx2tensorrt(
  File "tools/deployment/onnx2tensorrt.py", line 40, in onnx2tensorrt
    trt_engine = onnx2trt(
  File "/home/PJLAB/shenkun/workspace/mmcv/mmcv/tensorrt/tensorrt_utils.py", line 63, in onnx2trt
    raise RuntimeError(f'parse onnx failed:\n{error_msgs}')
RuntimeError: parse onnx failed:
In node -1 (convertAxis): UNSUPPORTED_NODE: Assertion failed: axis >= 0 && axis < nbDims

(pt1.8) PJLAB\shenkun@shai14001070l:~/workspace/mmdetection$ python tools/deployment/onnx2tensorrt.py configs/ssd/ssd300_coco.py tmp/ssd.onnx --trt-file='ssd.trt' --input-img='tests/data/color.jpg' --shape 800 1216
tools/deployment/onnx2tensorrt.py:199: UserWarning: Arguments like --to-rgb, --mean, --std, --dataset would be parsed directly from config file and are deprecated and will be removed in future releases.
  warnings.warn(
Traceback (most recent call last):
  File "tools/deployment/onnx2tensorrt.py", line 247, in <module>
    onnx2tensorrt(
  File "tools/deployment/onnx2tensorrt.py", line 40, in onnx2tensorrt
    trt_engine = onnx2trt(
  File "/home/PJLAB/shenkun/workspace/mmcv/mmcv/tensorrt/tensorrt_utils.py", line 63, in onnx2trt
    raise RuntimeError(f'parse onnx failed:\n{error_msgs}')
RuntimeError: parse onnx failed:
In node -1 (convertAxis): UNSUPPORTED_NODE: Assertion failed: axis >= 0 && axis < nbDims

(pt1.8) PJLAB\shenkun@shai14001070l:~/workspace/mmdetection$ python tools/deployment/onnx2tensorrt.py configs/yolo/yolov3_d53_320_273e_coco.py tmp/yolov3.onnx --trt-file='yolov3.trt' --input-img='tests/data/color.jpg' --shape 800 1216
tools/deployment/onnx2tensorrt.py:199: UserWarning: Arguments like --to-rgb, --mean, --std, --dataset would be parsed directly from config file and are deprecated and will be removed in future releases.
  warnings.warn(
Traceback (most recent call last):
  File "tools/deployment/onnx2tensorrt.py", line 247, in <module>
    onnx2tensorrt(
  File "tools/deployment/onnx2tensorrt.py", line 40, in onnx2tensorrt
    trt_engine = onnx2trt(
  File "/home/PJLAB/shenkun/workspace/mmcv/mmcv/tensorrt/tensorrt_utils.py", line 63, in onnx2trt
    raise RuntimeError(f'parse onnx failed:\n{error_msgs}')
RuntimeError: parse onnx failed:
In node -1 (convertAxis): UNSUPPORTED_NODE: Assertion failed: axis >= 0 && axis < nbDims
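When several models fail like this, it can help to pull just the failing parser step and its assertion out of each log. The helper below is a minimal sketch that assumes the error line always has the "In node N (name): UNSUPPORTED_NODE: Assertion failed: ..." shape seen in this thread; the names UnsupportedNode and find_unsupported_nodes are hypothetical and not part of mmcv or TensorRT.

```python
import re
from typing import List, NamedTuple


class UnsupportedNode(NamedTuple):
    importer: str   # name shown in parentheses, e.g. "convertAxis"
    assertion: str  # text following "Assertion failed:"


# Assumed line shape, copied from the error messages quoted in this thread.
PATTERN = re.compile(
    r"In node -?\d+ \((?P<importer>\w+)\): UNSUPPORTED_NODE: "
    r"Assertion failed: (?P<assertion>.+)"
)


def find_unsupported_nodes(log: str) -> List[UnsupportedNode]:
    """Return one entry per UNSUPPORTED_NODE line found in the log text."""
    return [
        UnsupportedNode(m["importer"], m["assertion"].strip())
        for m in PATTERN.finditer(log)
    ]


# Two error lines taken verbatim from the logs in this conversation.
log = (
    "RuntimeError: parse onnx failed:\n"
    "In node -1 (importRange): UNSUPPORTED_NODE: Assertion failed: "
    'inputs.at(0).isInt32() && "For range operator with dynamic inputs, '
    'this version of TensorRT only supports INT32!"\n'
    "RuntimeError: parse onnx failed:\n"
    "In node -1 (convertAxis): UNSUPPORTED_NODE: Assertion failed: "
    "axis >= 0 && axis < nbDims\n"
)

nodes = find_unsupported_nodes(log)
for node in nodes:
    print(f"{node.importer}: {node.assertion}")
```

Grouping the results by importer name makes it easy to see that FSAF, RetinaNet, SSD, and YOLOv3 all stop at the same convertAxis check.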

@jshilong
Collaborator Author

This PR works fine under PyTorch 1.6; the problem only appears with higher versions of PyTorch.

@ZwwWayne ZwwWayne merged commit c5a7b08 into open-mmlab:refactor_dense Oct 13, 2021
jshilong added a commit that referenced this pull request Oct 25, 2021
* Refactor one-stage get_bboxes logic (#5317)

* revert batch to single

* update anchor_head

* replace preds with bboxes

* add point_bbox_coder

* FCOS add get_selected_priori

* unified anchor-free and anchor-based get_bbox_single

* update code

* update reppoints and sabl

* add sparse priors

* add mlvlpointsgenerator

* revert __init__ of core

* refactor reppoints

* delete label channal

* add docstr

* fix typo

* fix args

* fix typo

* fix doc

* fix stride_h

* add offset

* Unified bbox coder

* add offset

* remove point_bbox_coder.py

* fix docstr

* new interface of single_proir

* fix device

* add unitest

* add cuda unitest

* add more cuda unintest

* fix reppoints

* fix device

* update all prior

* update vfnet

* add unintest for ssd and yolo and rename prior_idxs

* add docstr for MlvlPointGenerator

* update reppoints and rpnhead

* add space

* add num_base_priors

* update some model

* update docstr

* fixAugFPN test and lint.

* Fix autoassign

* add docs

* Unified fcos decoding

* update docstr

* fix train error

* Fix Vfnet

* Fix some

* update centernet

* revert

* add warnings

* fix unittest error

* delete duplicated

* fix comment

* fix docs

* fix type

Co-authored-by: zhangshilong <2392587229zsl@gmail.com>

* support onnx export for fcos

* support onnx export for fcos fsaf retina and ssd

* resolve comments

* resolve comments

* add with nms

* support cornernet

* resolve comments

* add default with nms

* fix trt arrange should be int

Co-authored-by: Haian Huang(深度眸) <1286304229@qq.com>
jshilong added a commit that referenced this pull request Oct 26, 2021
ZwwWayne pushed a commit that referenced this pull request Oct 28, 2021
ZwwWayne pushed a commit to ZwwWayne/mmdetection that referenced this pull request Jul 19, 2022
Labels
refactor WIP Working in progress
6 participants