Containers introduction

Containers bundle a tool together with its software dependencies. This makes analyses more reproducible and tool installation much easier, so the same workflow or analysis can be run across different compute infrastructures in a portable manner. For example, this workflow can be run on HPC (JAX's Sumner HPC) or in the cloud (Lifebit's CloudOS platform with AWS & GCloud). This is possible thanks to containers and to the workflow manager Nextflow, which has built-in support for container engines such as Docker and Singularity.

Resources

  • An introduction to containers (as well as Nextflow & CloudOS) can be found here
  • Instructions on installing Docker and Singularity can be found here ⚠️ You will need root permissions to install Docker ⚠️
  • You can also read this guide (4-minute read) for a more high-level overview of Docker containers.

If you're still confused after reading it, don't hesitate to ping @PhilPalmer or @adeslatt.

Important note

Ideally, containers should be:

  1. Docker containers
    • This is because Nextflow can convert Docker -> Singularity containers but not the other way around
  2. Hosted in a Google Container Registry (GCR)
    • This is because, when running on Google Cloud (e.g. on CloudOS), containers that are not hosted there take too long to fetch and cause the pipeline to fail

Doing both makes containers as portable as possible. However, each has a drawback: Docker requires root or admin access (which is not available on Sumner), and you must have access to a GCR to be able to push containers there.
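To check by hand that a Docker image converts cleanly, something like the following sketch could be tried on a machine with Singularity installed. It uses the image name from the build example further down; the output file name and the `docker_uri` helper are made up for illustration, and pulling from a private registry may need credentials that are not shown here.

```shell
# Hypothetical helper: turn a Docker image reference into the docker:// URI
# that `singularity pull` accepts.
docker_uri() {
  printf 'docker://%s\n' "$1"
}

IMAGE="gcr.io/nextflow-250616/splicing-pipelines-nf:gawk"
SIF="splicing-pipelines-nf-gawk.sif"   # hypothetical output file name

if command -v singularity >/dev/null 2>&1; then
  # Convert the Docker image into a local Singularity image file
  singularity pull "$SIF" "$(docker_uri "$IMAGE")"
else
  echo "singularity not found; would run: singularity pull $SIF $(docker_uri "$IMAGE")"
fi
```

Nextflow normally does this conversion itself when Singularity is enabled, so a manual pull is only useful for debugging.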

To build a Docker container

If you need to modify one of the containers, e.g. to update or add software dependencies, rebuild it with:

docker build -t <registry_user>/<image_name>:<tag> .

Eg:

cd containers/splicing-pipelines-nf/
docker build -t gcr.io/nextflow-250616/splicing-pipelines-nf:gawk .
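The image reference is just the registry, project, image name and tag joined together, so it can be assembled once and reused for both build and push. A minimal sketch, assuming the values from the example above; the `image_ref` helper is hypothetical and the build command is only printed, not executed:

```shell
# Hypothetical helper: compose <registry>/<project>/<image_name>:<tag>
image_ref() {
  printf '%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

REF="$(image_ref gcr.io nextflow-250616 splicing-pipelines-nf gawk)"

# Print the command rather than running it; drop the leading `echo`
# (and run from containers/splicing-pipelines-nf/) to build for real.
echo docker build -t "$REF" .
```

Keeping the reference in one variable avoids building under one tag and pushing another by accident.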

To push your image to a container registry

docker push <registry_user>/<image_name>:<tag>

E.g. to GCR:

⚠️ You will need credentials/access for a Google Container Registry ⚠️

gcloud auth login
docker push gcr.io/<project_id>/<image_name>:<tag>
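If the push is rejected as unauthenticated, Docker may not yet be set up to use your gcloud credentials; `gcloud auth configure-docker` registers gcloud as a Docker credential helper for GCR hosts. The sketch below strings the steps together. The project ID is a placeholder, the `run` wrapper is hypothetical, and by default it only prints each command (set `DRY_RUN=0` to execute them):

```shell
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

PROJECT_ID="my-project"   # placeholder, replace with your own GCP project ID
IMAGE="gcr.io/$PROJECT_ID/splicing-pipelines-nf:gawk"

run gcloud auth login              # authenticate with Google Cloud
run gcloud auth configure-docker   # let Docker use gcloud credentials for GCR
run docker push "$IMAGE"           # upload the image built earlier
```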

Troubleshooting Singularity Images on Sumner

On Sumner, you may need to remove old Singularity images from the cache dir for an updated or new image to take effect:

  • The cache dir is a Nextflow feature. It saves the images so that they do not need to be pulled on each execution, which would be really slow
  • You should only need to clear the cache when the containers are updated.
  • We ended up setting the cache dir to cacheDir = "/projects/anczukow-lab/.singularity_cache/"
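Rather than wiping the whole cache, a single stale image can be removed. The sketch below assumes Nextflow names cached files by dropping any `docker://` prefix, replacing `/` and `:` with `-`, and appending `.img`; check the actual file names in the cache dir before relying on this. The `cache_file_name` helper is hypothetical.

```shell
# Hypothetical helper: guess the cache file name for an image reference.
# Assumed scheme: strip "docker://", replace "/" and ":" with "-", add ".img".
cache_file_name() {
  printf '%s\n' "$1" | sed -e 's#^docker://##' -e 's#[/:]#-#g' -e 's#$#.img#'
}

CACHE_DIR="${CACHE_DIR:-/projects/anczukow-lab/.singularity_cache}"
IMAGE="gcr.io/nextflow-250616/splicing-pipelines-nf:gawk"
TARGET="$CACHE_DIR/$(cache_file_name "$IMAGE")"

if [ -f "$TARGET" ]; then
  rm -f "$TARGET"                  # the next run pulls a fresh copy
  echo "removed $TARGET"
else
  echo "nothing to remove at $TARGET"
fi
```

Removing only the affected image keeps the other cached images in place, so the next run re-pulls just the one that changed.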