
Device #1790

Merged
merged 2 commits into master from device on May 12, 2020

Conversation

williamFalcon (Contributor)

Add a `self.device` pointer to the LightningModule.

codecov bot commented May 12, 2020

Codecov Report

Merging #1790 into master will decrease coverage by 0%.
The diff coverage is 62%.

@@          Coverage Diff           @@
##           master   #1790   +/-   ##
======================================
- Coverage      88%     88%   -0%     
======================================
  Files          69      69           
  Lines        4304    4312    +8     
======================================
+ Hits         3796    3801    +5     
- Misses        508     511    +3     

@williamFalcon williamFalcon merged commit 4b30ef6 into master May 12, 2020
awaelchli (Contributor)

If the LightningModule gets used in a context outside of Lightning (simply as an nn.Module), then moving the module with `.to(...)` could break the user's code if it internally uses `self.device`, as in the GAN example. We should probably override the `.to` and `.cuda` methods so that `self.device` gets updated; then the Trainer wouldn't have to update it.
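A minimal sketch of that idea (the class name is hypothetical, and inferring the device from the parameters after the move is an assumption here, not the merged implementation):

```python
import torch
from torch import nn


class DeviceTrackingModule(nn.Module):
    """Keeps a self.device pointer in sync with .to()/.cuda()/.cpu() moves."""

    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")

    @property
    def device(self) -> torch.device:
        return self._device

    def _refresh_device(self) -> None:
        # Infer the device from the first parameter, if there is one.
        param = next(self.parameters(), None)
        if param is not None:
            self._device = param.device

    def to(self, *args, **kwargs):
        module = super().to(*args, **kwargs)
        module._refresh_device()
        return module

    def cuda(self, device=None):
        module = super().cuda(device)
        module._refresh_device()
        return module

    def cpu(self):
        module = super().cpu()
        module._refresh_device()
        return module
```

With something like this, `model.to("cuda:0")` would keep `model.device` accurate without the Trainer ever having to set it.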

Comment on lines 503 to +504
model.to(xm.xla_device())
self.device = xm.xla_device()
Contributor

For example, here we could simply override `.to` in the LightningModule, and no extra code in the Trainer would be necessary.

Contributor

The code would be in one place and therefore easier to maintain.

Contributor

Also, `.to` recurses into submodules, so this approach would automatically take care of nested LightningModules!
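A quick demonstration of that recursion, using a dtype move so it runs without a GPU:

```python
import torch
from torch import nn

# .to() recurses: moving the parent also converts nested children's parameters.
parent = nn.Sequential(nn.Linear(2, 2), nn.Sequential(nn.Linear(2, 2)))
parent.to(torch.float64)
print(next(parent[1].parameters()).dtype)  # torch.float64
```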

Contributor (Author)

Oh wait... does nn.Module have a `.device`?
If so, I don't think we should override it, no? I thought the weights had it, not the module.

Contributor (Author)

I need to sleep, lol... I'll think about it tomorrow, but it sounds interesting :)

Contributor

No, it doesn't have it; see pytorch/pytorch#7460.
I guess if we do it, we would run into issues when the user starts to move their submodules to different devices by hand, but that would be a problem anyway :)
Yes, let's do it tomorrow :)
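For context, the usual way to query a plain nn.Module's device is via its parameters, since the module itself stores no device (a well-known idiom, not part of this PR):

```python
import torch
from torch import nn

model = nn.Linear(4, 2)

# nn.Module has no .device attribute; its parameters and buffers do.
device = next(model.parameters()).device
print(device)  # -> cpu

# Caveat from the discussion above: this breaks down once submodules
# are moved to different devices by hand.
```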

Member

I think we could do this as a read-only property, similar to what I did for metrics. But I also agree: it should be read-only and not be used for device transfers.
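A read-only property along those lines could look like this (a sketch; inferring from the parameters is an assumption, and this is not the actual metrics code being referenced):

```python
import torch
from torch import nn


class MyModule(nn.Module):
    @property
    def device(self) -> torch.device:
        # Read-only: reports where the parameters live. There is no setter,
        # so `module.device = ...` raises AttributeError, and the property
        # cannot be misused for device transfers.
        param = next(self.parameters(), None)
        return param.device if param is not None else torch.device("cpu")
```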

Contributor

Exactly like you did in metrics; that's what I meant 👍 Nice

@mergify mergify bot requested a review from a team May 12, 2020 04:23
@Borda Borda deleted the device branch May 12, 2020 06:03
@Borda Borda added the feature Is an improvement or enhancement label May 12, 2020
@Borda Borda added this to the 0.7.6 milestone May 12, 2020
@Borda Borda mentioned this pull request May 12, 2020
ananthsub added a commit that referenced this pull request Feb 5, 2022
Use trainer.strategy.root_device in favor of LightningModule.device in DeviceStatsMonitor

Minor refactor to use the strategy's own `root_device` instead of the LightningModule's device property.

Attempts at manual model parallelization by extending this plugin will face difficulties with the assumption that the LightningModule has all of its parameters on the same device. 

For those use cases, it is critical to remove the assumption that the module has a device property (a device attribute in general goes against the design principles of PyTorch modules):
- pytorch/pytorch#7460
- #1790 (comment)
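For illustration, a callback can read the device from the strategy rather than the module (a sketch assuming the `trainer.strategy.root_device` attribute named in the commit message; the callback itself is hypothetical):

```python
import pytorch_lightning as pl


class LogRootDevice(pl.Callback):
    """Toy callback: asks the strategy, not the LightningModule, for the device."""

    def on_train_start(self, trainer, pl_module) -> None:
        # Works even when the module's parameters are spread across
        # devices and pl_module.device would be ambiguous.
        print(f"root device: {trainer.strategy.root_device}")
```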
Borda pushed a commit that referenced this pull request Feb 5, 2022
* Use trainer.strategy.root_device in favor of LightningModule.device in DeviceStatsMonitor