I use SyncBN to train Mask R-CNN but hit a problem when testing the model. Can you give me some suggestions for using SyncBN? #847

Closed
ztyxd opened this issue Jun 21, 2019 · 4 comments

Comments

ztyxd commented Jun 21, 2019

I use SyncBN to train Mask R-CNN. I followed the GN configs and only changed
norm_cfg = dict(type='GN', requires_grad=True) to norm_cfg = dict(type='SyncBN', requires_grad=True). This works well in the training phase.

But when I try to test the model, I get the following error:
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/base.py", line 87, in forward
    return self.forward_test(img, img_meta, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/base.py", line 79, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 241, in simple_test
    x = self.extract_feat(img)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/detectors/cascade_rcnn.py", line 115, in extract_feat
    x = self.backbone(img)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/xaserver/DATA/swl/Projects/RSSRAI2019/mmdetection/mmdet/models/backbones/resnet.py", line 509, in forward
    x = self.norm1(x)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 455, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 584, in get_world_size
    return _get_group_size(group)
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 200, in _get_group_size
    _check_default_pg()
  File "/home/xaserver/anaconda3/envs/rssrai_mmdetection/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 191, in _check_default_pg
    "Default process group is not initialized"
AssertionError: Default process group is not initialized

The test code works well when I do not use SyncBN. Can you give me some suggestions about this problem?
Thanks a lot.

GYxiaOH commented Jun 21, 2019

I have the same problem.

GYxiaOH commented Jun 21, 2019

I have the same problem. When I use the high-level API inference_detector(model, img), it fails with:

  File "/search/speech/zhanghongyuan/anaconda3/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 200, in _get_group_size
    _check_default_pg()
  File "/search/speech/zhanghongyuan/anaconda3/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 191, in _check_default_pg
    "Default process group is not initialized"

Member

hellock commented Jun 21, 2019

SyncBN only works in a distributed environment. You can either use tools/dist_test.sh or manually change SyncBN to BN during testing.
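
For the second option, one way to change SyncBN to BN is to rewrite the model config before building the model for testing. The snippet below is only a minimal sketch, not part of mmdetection: swap_norm_type is a hypothetical helper, and the config is a plain nested dict written for illustration (whether it can be applied directly to a loaded mmcv config object is an assumption to verify). It relies on the fact that PyTorch's BatchNorm and SyncBatchNorm layers use the same parameter and buffer names, so a checkpoint trained with SyncBN should normally load into the BN model.

```python
import copy


def swap_norm_type(cfg, old="SyncBN", new="BN"):
    """Return a deep copy of cfg in which every dict with type == old uses new instead.

    Hypothetical helper for illustration; it is not an mmdetection API.
    """
    cfg = copy.deepcopy(cfg)  # leave the training config untouched

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == old:
                node["type"] = new
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Illustrative config fragment in the style of the norm_cfg setting from the question.
model_cfg = dict(
    type="MaskRCNN",
    backbone=dict(
        type="ResNet",
        depth=50,
        norm_cfg=dict(type="SyncBN", requires_grad=True),
    ),
    neck=dict(type="FPN", norm_cfg=dict(type="SyncBN", requires_grad=True)),
)

test_cfg = swap_norm_type(model_cfg)
print(test_cfg["backbone"]["norm_cfg"])  # {'type': 'BN', 'requires_grad': True}
```

The copy keeps the original SyncBN config available for distributed training, while the rewritten one can be used for single-process testing or inference_detector.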

Author

ztyxd commented Jun 21, 2019

Thanks a lot

ztyxd closed this as completed Jun 21, 2019