
Partially freeze SWIN backbone #8208

Open
levan92 opened this issue Jun 17, 2022 · 4 comments
levan92 (Contributor) commented Jun 17, 2022

Hi, how can I partially freeze a Swin backbone? I've tried adding `frozen_stages=3` to the backbone in the config, but hit the following error:

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by making sure all `forward` function outputs participate in calculating loss.
If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
Parameter indices which did not receive grad for rank 0: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 325 326

Is there something else that needs to be set? Thank you!
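For concreteness, the change being attempted looks roughly like this (a sketch in mmdetection's Python config style; the `type` value and surrounding fields are illustrative assumptions, not the exact config used):

```python
# Hedged sketch of the attempted config change; frozen_stages is the point
# here, the other fields stand in for the real backbone settings.
model = dict(
    backbone=dict(
        type='SwinTransformer',
        frozen_stages=3,  # intended to freeze the patch embedding and first 3 stages
        # ... remaining backbone settings unchanged ...
    ),
)
```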

levan92 (Contributor, Author) commented Jun 24, 2022

@BIGWangYuDong any updates on this?

levan92 (Contributor, Author) commented Aug 16, 2022

Any updates? @BIGWangYuDong

austinmw (Contributor) commented Apr 12, 2023

I get the same error when using `frozen_stages` with CSPNeXt.

I believe this can be fixed by adding `self._freeze_stages()` at the end of the backbone `__init__` methods, similar to the ResNet code (here for Swin and here for CSPNeXt).
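A minimal sketch of what that fix looks like, using a hypothetical toy backbone rather than mmdetection's actual Swin/CSPNeXt classes (the attribute names `frozen_stages` and `stages` are assumptions modeled on the ResNet pattern):

```python
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical stand-in for a Swin/CSPNeXt-style backbone."""

    def __init__(self, frozen_stages=3, num_stages=4):
        super().__init__()
        self.frozen_stages = frozen_stages
        self.stages = nn.ModuleList([nn.Linear(8, 8) for _ in range(num_stages)])
        # The suggested fix: freeze at the end of __init__, so requires_grad
        # is already False by the time DDP wraps the model.
        self._freeze_stages()

    def _freeze_stages(self):
        # Put the first `frozen_stages` stages in eval mode and exclude
        # their parameters from gradient computation.
        for i in range(self.frozen_stages):
            stage = self.stages[i]
            stage.eval()
            for param in stage.parameters():
                param.requires_grad = False
```

This helps because `DistributedDataParallel` only registers parameters that have `requires_grad=True` at wrap time; if the freeze happens later (e.g. only in `train()`), DDP still expects gradients for the frozen parameters and raises the "did not receive grad" error above.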

gachiemchiep commented

@austinmw I got the same error. Your code fixes it. Thank you very much!

liuchang0523 added a commit to liuchang0523/mmyolo that referenced this issue Dec 7, 2023
Solve the problem:
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by making sure all `forward` function outputs participate in calculating loss.

reference:
open-mmlab/mmdetection#8208 (comment)
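Note that the error message's own suggestion, `find_unused_parameters=True`, can also be set as a top-level option in an mmdetection config (a workaround sketch; it makes DDP tolerate unused parameters at some runtime cost, rather than fixing the freezing logic itself):

```python
# Workaround sketch: mmdetection reads this top-level config flag and
# forwards it to torch.nn.parallel.DistributedDataParallel.
find_unused_parameters = True
```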