Context: We’re a team trying to build a reverse image searcher for satellite data. Currently, we’re using the UC Merced land use dataset (linked in notebook below) to train a self-supervised learner and evaluate it using the labels provided with the dataset.
Issue: When training SimCLR, batch sizes above 16 cause a CUDA OOM error. This occurs both on a Colab T4 GPU and locally on an NVIDIA GeForce RTX 2080. When we manually save the model using trainer.save_checkpoint(myPath), the output file is ~328 MB. We use a custom dataset of images and instantiate our own dataset class. Could the issue stem from our data loading? With a fine-tuning SSL classifier added on top (total model ~1 GB), the problem persists, and loading that larger model into CUDA memory further restricts the usable batch size. Could our small batch size also hurt the LARS optimizer's performance?
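As a rough sanity check on the checkpoint size, a ~328 MB file stored as float32 implies on the order of tens of millions of values; a Lightning checkpoint also stores optimizer state alongside the weights, so the actual parameter count is likely well below this estimate (all numbers here are back-of-envelope, not measured from our model):

```python
# Back-of-envelope: how many float32 values a ~328 MB checkpoint could hold.
# Note: a Lightning checkpoint also contains optimizer state (e.g. LARS
# momentum buffers), so real parameter count may be roughly half or less.
checkpoint_bytes = 328 * 1024**2   # observed checkpoint size, ~328 MB
bytes_per_param = 4                # float32
approx_values = checkpoint_bytes // bytes_per_param
print(f"~{approx_values / 1e6:.0f}M float32 values")  # ~86M
```

This suggests the model weights alone are not unusually large for a SimCLR backbone, which is part of why the OOM at batch size 16 surprises us.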
Code to Reproduce (Colab): https://colab.research.google.com/drive/1b0xukK7RIw6VxrEJb23f91xaMWxBTbs6?usp=sharing