remove extra cuda calls (#75)
Summary:
Before facebookresearch/fairscale#543 was merged, we needed the extra `cuda()` calls before wrapping. Now they are no longer needed.
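
For illustration, here is a minimal single-GPU sketch of the pattern this change enables (not the VISSL code: the `Linear` block, the port, and the one-process group are stand-ins, and it assumes a fairscale build that already includes #543). The module is built on the CPU and handed to FSDP without an explicit `cuda()` call:

```python
import os
import torch
import torch.distributed as dist
from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP


def main() -> None:
    # One-process "distributed" group, just to satisfy FSDP's process-group requirement.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="nccl", rank=0, world_size=1)
    torch.cuda.set_device(0)

    block = torch.nn.Linear(1024, 1024)  # stand-in for a RegNet block, constructed on CPU
    # Previously an explicit block.cuda() was needed here before wrapping.
    wrapped = FSDP(block)  # with fairscale#543, FSDP handles device placement itself
    out = wrapped(torch.randn(8, 1024, device="cuda"))
    print(out.shape)


if __name__ == "__main__":
    main()
```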

Unfortunately, this doesn't solve the long model init time issue we have. An FSDP model init still takes >20 minutes for me, which is really bad for debugging the regnet128 conv layer crash I am working on.

The following debugging output shows that most of the delay is in FSDP wrapping, some in BN wrapping, and some in the layer wrapping.

```
INFO 2021-04-14 12:18:35,883 regnet_2.py: 159: block created
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:18:35,884 regnet_2.py: 161: cpu
INFO 2021-04-14 12:19:07,388 regnet_2.py: 163: block bn wrapped
INFO 2021-04-14 12:19:18,388 regnet_2.py: 166: block wrapped
```
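
For reference, a hypothetical sketch of the kind of instrumentation that produces timestamps like the ones above (the `timed` helper and the wrapped calls are illustrative, not the actual regnet_2.py code):

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="INFO %(asctime)s %(message)s")
logger = logging.getLogger(__name__)


@contextmanager
def timed(step: str):
    """Log how long the enclosed step takes, e.g. BN wrapping vs. FSDP wrapping."""
    start = time.perf_counter()
    yield
    logger.info("%s (%.1fs)", step, time.perf_counter() - start)


# Usage around the suspected slow spots (names are illustrative):
# with timed("block bn wrapped"):
#     block = auto_wrap_bn(block)
# with timed("block wrapped"):
#     block = fsdp_wrapper(block, **fsdp_config)
```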

In any case, this PR is pretty safe and should go in so that we don't need to do an extra `cuda()` call before wrapping.

Pull Request resolved: fairinternal/ssl_scaling#75

Reviewed By: prigoyal

Differential Revision: D27776285

Pulled By: min-xu-ai

fbshipit-source-id: 3e43c6fe750fd6ee35933400b03a069d62040d8a
min-xu-ai authored and facebook-github-bot committed Apr 15, 2021
1 parent 20295c5 commit c29fe66
Showing 1 changed file with 2 additions and 2 deletions.
vissl/models/trunks/regnet_fsdp.py (2 additions, 2 deletions)

```
@@ -106,7 +106,7 @@ def __init__(
                 bot_mul,
                 group_width,
                 params.se_ratio,
-            ).cuda()
+            )
             # Init weight before wrapping and sharding.
             init_weights(block)

@@ -127,7 +127,7 @@ class RegNetFSDP(FSDP):
     """

     def __init__(self, model_config: AttrDict, model_name: str):
-        module = _RegNetFSDP(model_config, model_name).cuda()
+        module = _RegNetFSDP(model_config, model_name)
         super().__init__(module, **model_config.FSDP_CONFIG)
```
