Context: We’re a team trying to build a reverse image searcher for satellite data. Currently, we’re using the UC Merced land use dataset (linked in notebook below) to train a self-supervised learner and evaluate it using the labels provided with the dataset.
Issue: When training SimCLR, batch sizes above 16 cause a CUDA OOM error. This occurs both on a Colab T4 GPU and locally on an NVIDIA GeForce RTX 2080. When we manually save the model using trainer.save_checkpoint(myPath), the output file is ~328 MB. We use a custom dataset of images and instantiate our own dataset class. Could the issue stem from our data loading? With a fine-tuning SSL classifier added on top (total model ~1 GB), the problem persists, and loading that larger model into CUDA memory further restricts the usable batch size. Could our small batch size also hurt the LARS optimizer's performance?
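As a rough sanity check on the checkpoint size, a ~328 MB file stored as float32 implies on the order of tens of millions of values; a Lightning checkpoint also stores optimizer state alongside the weights, so the actual parameter count is likely well below this estimate (all numbers here are back-of-envelope, not measured from our model):

```python
# Back-of-envelope: how many float32 values a ~328 MB checkpoint could hold.
# Note: a Lightning checkpoint also contains optimizer state (e.g. LARS
# momentum buffers), so real parameter count may be roughly half or less.
checkpoint_bytes = 328 * 1024**2   # observed checkpoint size, ~328 MB
bytes_per_param = 4                # float32
approx_values = checkpoint_bytes // bytes_per_param
print(f"~{approx_values / 1e6:.0f}M float32 values")  # ~86M
```

This suggests the model weights alone are not unusually large for a SimCLR backbone, which is part of why the OOM at batch size 16 surprises us.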
Code to Reproduce (Colab): https://colab.research.google.com/drive/1b0xukK7RIw6VxrEJb23f91xaMWxBTbs6?usp=sharing