
RuntimeError: CUDA out of memory #1

Open
tolandwehr opened this issue Apr 30, 2020 · 0 comments

Comments


Hi there,

I tried adapting your StyleGAN approach (largely unchanged) to my problem (with a smaller test set), and it might work... but unfortunately it's hard to tell, because the process is interrupted at the 32-pixel level by a "CUDA out of memory" error. I tried the solutions suggested here, including the 1/0 approach, none of them with success. I also tried raising alpha_batch (and thus reducing the batch_size): no success.
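For context on why the 32-pixel stage in particular runs out of memory: in a progressive GAN each resolution step quadruples the spatial size of the activations, so the footprint grows fast even if batch_size shrinks. A rough back-of-envelope sketch (the batch size and channel count below are hypothetical, not taken from this repo):

```python
def activation_mib(batch_size, channels, height, width, bytes_per_elem=4):
    """Memory for one float32 activation tensor, in MiB."""
    return batch_size * channels * height * width * bytes_per_elem / 2**20

# Hypothetical StyleGAN-ish shapes: going from the 16x16 stage to the
# 32x32 stage quadruples the per-tensor footprint for the same batch.
before = activation_mib(16, 512, 16, 16)  # 8.0 MiB per tensor
after = activation_mib(16, 512, 32, 32)   # 32.0 MiB per tensor
print(before, after)
```

Multiplied across the dozens of intermediate tensors the generator, critic, and autograd graph keep alive, this is how a 4 GiB card fills up at exactly this stage, which is why halving batch_size at each resolution step is the usual mitigation.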

Here is the complete error response:

RuntimeError                              Traceback (most recent call last)
<ipython-input-34-a0dbf2349b14> in <module>
----> 1 learn.fit(3, [1e-4, 1e-3])

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in fit(self, epochs, lr, wd, callbacks)
    198         else: self.opt.lr,self.opt.wd = lr,wd
    199         callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
--> 200         fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
    201 
    202     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in fit(epochs, learn, callbacks, metrics)
     99             for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
    100                 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101                 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
    102                 if cb_handler.on_batch_end(loss): break
    103 

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     28 
     29     if not loss_func: return to_detach(out), to_detach(yb[0])
---> 30     loss = loss_func(out, *yb)
     31 
     32     if opt is not None:

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~\Anaconda3\envs\Tensorflow\lib\site-packages\fastai\vision\gan.py in forward(self, *args)
     46 
     47     def forward(self, *args):
---> 48         return self.generator(*args) if self.gen_mode else self.critic(*args)
     49 
     50     def switch(self, gen_mode:bool=None):

<ipython-input-19-28b775f866bd> in critic(self, real_pred, input)
     13     def critic(self, real_pred, input):
     14         "Create some `fake_pred` with the generator from `input` and compare them to `real_pred` in `self.loss_funcD`."
---> 15         fake = self.gan_model.generator(input.requires_grad_(False)).requires_grad_(True)
     16         fake_pred = self.gan_model.critic(fake, actual=True)
     17 

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-16-d156c49088a2> in forward(self, input, noise, mean_style, style_weight, mixing_range, mixing)
     51             styles = styles_norm
     52 
---> 53         return self.generator(styles, noise, self.step, self.alpha, mixing_range=mixing_range)
     54 
     55     def mean_style(self, input):

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-14-1cf1d24ff3ee> in forward(self, style, noise, step, alpha, mixing_range)
     58             if i > 0 and step > 0:
     59                 upsample = F.interpolate(
---> 60                     out, scale_factor=2, mode='bilinear', align_corners=False
     61                 )
     62                 out = conv(upsample, style_step, noise[i])

~\Anaconda3\envs\Tensorflow\lib\site-packages\torch\nn\functional.py in interpolate(input, size, scale_factor, mode, align_corners)
   2528         raise NotImplementedError("Got 4D input, but linear mode needs 3D input")
   2529     elif input.dim() == 4 and mode == 'bilinear':
-> 2530         return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
   2531     elif input.dim() == 4 and mode == 'trilinear':
   2532         raise NotImplementedError("Got 4D input, but trilinear mode needs 5D input")

RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 4.00 GiB total capacity; 2.18 GiB already allocated; 58.16 MiB free; 2.88 GiB reserved in total by PyTorch)
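The numbers in the error message itself show the gap (a quick sketch; all values are copied from the error line above):

```python
# Figures reported by the RuntimeError, in MiB.
total = 4.00 * 1024      # GPU 0 total capacity
allocated = 2.18 * 1024  # live tensors
reserved = 2.88 * 1024   # held by PyTorch's caching allocator
free = 58.16             # free outside the allocator's cache
request = 256.00         # size of the failed allocation

# Memory reserved by PyTorch but not backing live tensors.
cached_but_unused = reserved - allocated
print(round(cached_but_unused, 1))  # ~716.8 MiB sits in the allocator cache
print(request > free)               # True: the 256 MiB block cannot be placed
```

So roughly 0.7 GiB is cached by PyTorch but not in use; `torch.cuda.empty_cache()` can sometimes return it to the driver, but if the live tensors alone approach the card's capacity, the only reliable fix is a smaller batch size (or a smaller model) at this resolution.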

Any idea how to fix or avoid it? (I also tried a Google Colab approach, but Colab tends to collapse due to another issue.)

Thanks in advance!
