
RuntimeError: CUDA out of memory #1

Open
tolandwehr opened this issue Apr 30, 2020 · 0 comments

Comments


Hi there,

I tried adapting your StyleGAN approach (largely unchanged) to my problem (with a smaller test set), and it might work... but unfortunately it's hard to tell, because the process is interrupted at the 32-pixel level by a "CUDA out of memory" error. I tried the solutions suggested here, including the 1/0 approach, none of them with success. I also tried raising alpha_batch (and thus reducing the batch_size): no success.
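For context on why the 32-pixel stage in particular runs out of memory: in a progressive GAN each resolution step quadruples the spatial size of the activations, so the footprint grows fast even if batch_size shrinks. A rough back-of-envelope sketch (the batch size and channel count below are hypothetical, not taken from this repo):

```python
def activation_mib(batch_size, channels, height, width, bytes_per_elem=4):
    """Memory for one float32 activation tensor, in MiB."""
    return batch_size * channels * height * width * bytes_per_elem / 2**20

# Hypothetical StyleGAN-ish shapes: going from the 16x16 stage to the
# 32x32 stage quadruples the per-tensor footprint for the same batch.
before = activation_mib(16, 512, 16, 16)  # 8.0 MiB per tensor
after = activation_mib(16, 512, 32, 32)   # 32.0 MiB per tensor
print(before, after)
```

Multiplied across the dozens of intermediate tensors the generator, critic, and autograd graph keep alive, this is how a 4 GiB card fills up at exactly this stage, which is why halving batch_size at each resolution step is the usual mitigation.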

Here is the complete error response:

RuntimeError                              Traceback (most recent call last)
<ipython-input-34-a0dbf2349b14> in <module>
----> 1 learn.fit(3, [1e-4, 1e-3])

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
    198         else: self.opt.lr,self.opt.wd = lr,wd
    199         callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
--> 200         fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
    201 
    202     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in fit(epochs, learn, callbacks, metrics)
     99             for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
    100                 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101                 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
    102                 if cb_handler.on_batch_end(loss): break
    103 

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     28 
     29     if not loss_func: return to_detach(out), to_detach(yb[0])
---> 30     loss = loss_func(out, *yb)
     31 
     32     if opt is not None:

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\vision\gan.py in forward(self, *args)
     46 
     47     def forward(self, *args):
---> 48         return self.generator(*args) if self.gen_mode else self.critic(*args)
     49 
     50     def switch(self, gen_mode:bool=None):

<ipython-input-19-28b775f866bd> in critic(self, real_pred, input)
     13     def critic(self, real_pred, input):
     14         "Create some `fake_pred` with the generator from `input` and compare them to `real_pred` in `self.loss_funcD`."
---> 15         fake = self.gan_model.generator(input.requires_grad_(False)).requires_grad_(True)
     16         fake_pred = self.gan_model.critic(fake, actual=True)
     17 

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-16-d156c49088a2> in forward(self, input, noise, mean_style, style_weight, mixing_range, mixing)
     51             styles = styles_norm
     52 
---> 53         return self.generator(styles, noise, self.step, self.alpha, mixing_range=mixing_range)
     54 
     55     def mean_style(self, input):

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-14-1cf1d24ff3ee> in forward(self, style, noise, step, alpha, mixing_range)
     58             if i > 0 and step > 0:
     59                 upsample = F.interpolate(
---> 60                     out, scale_factor=2, mode='bilinear', align_corners=False
     61                 )
     62                 out = conv(upsample, style_step, noise[i])

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\functional.py in interpolate(input, size, scale_factor, mode, align_corners)
   2528         raise NotImplementedError("Got 4D input, but linear mode needs 3D input")
   2529     elif input.dim() == 4 and mode == 'bilinear':
-> 2530         return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
   2531     elif input.dim() == 4 and mode == 'trilinear':
   2532         raise NotImplementedError("Got 4D input, but trilinear mode needs 5D input")

RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 4.00 GiB total capacity; 2.18 GiB already allocated; 58.16 MiB free; 2.88 GiB reserved in total by PyTorch)
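The numbers in the error message itself show the gap (a quick sketch; all values are copied from the error line above):

```python
# Figures reported by the RuntimeError, in MiB.
total = 4.00 * 1024      # GPU 0 total capacity
allocated = 2.18 * 1024  # live tensors
reserved = 2.88 * 1024   # held by PyTorch's caching allocator
free = 58.16             # free outside the allocator's cache
request = 256.00         # size of the failed allocation

# Memory reserved by PyTorch but not backing live tensors.
cached_but_unused = reserved - allocated
print(round(cached_but_unused, 1))  # ~716.8 MiB sits in the allocator cache
print(request > free)               # True: the 256 MiB block cannot be placed
```

So roughly 0.7 GiB is cached by PyTorch but not in use; `torch.cuda.empty_cache()` can sometimes return it to the driver, but if the live tensors alone approach the card's capacity, the only reliable fix is a smaller batch size (or a smaller model) at this resolution.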

Any idea how to fix or avoid it? (I also tried a Google Colab approach, but Colab tends to collapse due to another issue.)

Thanks in advance!
