Batch Size & CUDA Out of Memory Error for Custom Dataset - SimCLR #355

Closed
RudyVenguswamy opened this issue Nov 10, 2020 · 1 comment · Fixed by #329

Comments

@RudyVenguswamy

Context: We’re a team trying to build a reverse image searcher for satellite data. Currently, we’re using the UC Merced land use dataset (linked in notebook below) to train a self-supervised learner and evaluate it using the labels provided with the dataset.

Issue: We cannot train SimCLR with a batch size above 16 without hitting a CUDA OOM error. This occurs both on a Colab T4 GPU and locally on an NVIDIA GeForce RTX 2080. When we manually save the model using trainer.save_checkpoint(myPath), the output file is ~328 MB. We use a custom dataset of images and instantiate a dataset class. Does the issue stem from our data loading? With a finetuner SSL classifier added on top (~1 GB model in total), the issue persists, and loading that larger model into CUDA memory limits the batch size even further. Could using such a small batch size also hurt the lars_optimizer's performance?

Code to Reproduce (Colab): https://colab.research.google.com/drive/1b0xukK7RIw6VxrEJb23f91xaMWxBTbs6?usp=sharing
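
As a rough sanity check on the numbers above, the sketch below compares the reported checkpoint sizes with the total memory of the two GPUs. The checkpoint sizes are the ones quoted in this issue. The GPU capacities (16 GB for the T4, 8 GB for the RTX 2080) are nominal spec values and an assumption here, not measurements, and `checkpoint_share` is a hypothetical helper written only for this illustration. Note that a checkpoint file is not the same as the in-memory footprint: it may also include optimizer state, and training adds gradients and activations on top.

```python
# Checkpoint sizes are taken from the issue text; GPU capacities are nominal
# spec values (an assumption, not measured on the machines in question).
CHECKPOINT_MB = {"SimCLR": 328, "SimCLR + finetuner": 1024}
GPU_MB = {"Colab T4": 16 * 1024, "GeForce RTX 2080": 8 * 1024}


def checkpoint_share(checkpoint_mb: float, gpu_mb: float) -> float:
    """Return the checkpoint size as a percentage of total GPU memory."""
    return 100.0 * checkpoint_mb / gpu_mb


if __name__ == "__main__":
    # Print the share for every model/GPU combination.
    for model, ckpt in CHECKPOINT_MB.items():
        for gpu, total in GPU_MB.items():
            share = checkpoint_share(ckpt, total)
            print(f"{model} on {gpu}: {share:.1f}% of GPU memory")
```

Under these assumptions the saved models account for roughly 2% to 12.5% of device memory, which suggests that most of the memory used at batch size 16 goes elsewhere, most likely to activations, which grow with batch size and image resolution. This is only an estimate and does not rule out a bug in the library.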

@RudyVenguswamy added the "help wanted" (Extra attention is needed) label Nov 10, 2020
@ananyahjha93
Contributor

@RudyVenguswamy this looks like a bug to me, let me check on this.

@ananyahjha93 mentioned this issue Nov 16, 2020
@Borda added this to the v0.3 milestone Jan 18, 2021