In order to access the cluster, you will need a user account at the Max Planck Computing and Data Facility - MPCDF Registration. If you already have an MPCDF account and would like to access our HPC, please mail us at bioinformatics@age.mpg.de.
Login to MPCDF Access Nodes
For security reasons, direct login to the HPC system is allowed only from within the MPCDF network. Users from other locations have to login to one of the gateway systems first.
From everywhere:
ssh <username>@gate.mpcdf.mpg.de
From the MPG network (when at the institute or connected via VPN):
ssh <username>@raven.mpcdf.mpg.de
Where <username> is your MPCDF login.
Once you have accessed one of these systems you can access our HPC:
ssh hpc.bioinformatics.studio
SLURM (Simple Linux Utility for Resource Management) is an open-source workload manager and job scheduler used primarily in high-performance computing (HPC) environments. It manages and schedules tasks across a cluster of computers, optimizing resource utilization and maximizing throughput.
Jobs
A job is a unit of work submitted to SLURM for execution. It can consist of one or more tasks, where each task is a process or a thread running on a node.
Partitions
Partitions (also known as queues) are groups of nodes with similar characteristics, such as CPU type, memory, or GPU availability. Jobs can be submitted to specific partitions based on their requirements. There are two partitions available on hpc.bioinformatics.studio: cluster and dedicated, where cluster is the default partition.
Nodes
Nodes are individual computers in the cluster that execute jobs. Each node has its own set of resources, such as CPUs and memory.
Resources
SLURM manages and allocates resources for jobs based on user requirements. This includes CPU cores, memory, and other hardware resources.
sinfo: Cluster Information
View partition information
sinfo
Show information of nodes
sinfo -N
sinfo -N -O partitionname,nodehost,cpus,cpusload,freemem,memory
Show information about nodes with a specific state (e.g., idle, alloc, mix, etc.)
sinfo -t <node_state>
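For example, to list only the nodes that are currently idle:
sinfo -t idle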
sbatch: Submitting Jobs
Submit jobs to SLURM
sbatch [options] script.sh
Where script.sh is the shell script containing the commands you want to execute.
Common options:
-p <partition>: Specify the partition/queue for the job.
-n <tasks>: Number of tasks in the job.
--cpus-per-task=<cores>: Specify the number of CPU cores per task.
--mem=<memory>: Request memory for the job.
Example with options
sbatch -p <partition> --cpus-per-task=<n cpus> --mem=<n>gb -t <hours>:<minutes>:<seconds> -o <stdout file> <script>
Submissions without argument specifications will default to -p cluster and a time limit of 2 weeks.
Feel free to use all partitions (i.e. the cluster and dedicated partitions) for large job submissions.
When submitting jobs with sbatch you can also include SLURM parameters inside the script, e.g.:
#!/bin/bash
#SBATCH -p <partition>
#SBATCH --cpus-per-task=<n cpus>
#SBATCH --mem=<n>gb
#SBATCH -t <hours>:<minutes>:<seconds>
#SBATCH -o <stdout file>
<code>
and then run the script with sbatch <script>.
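As a concrete sketch with the placeholders filled in (the resource values and the my_job.sh file name below are purely illustrative):
#!/bin/bash
#SBATCH -p cluster
#SBATCH --cpus-per-task=4
#SBATCH --mem=8gb
#SBATCH -t 12:00:00
#SBATCH -o my_job.%j.out
# your commands go here
echo "Running on $(hostname) with ${SLURM_CPUS_PER_TASK} CPUs"
Saved e.g. as my_job.sh, it would be submitted with sbatch my_job.sh.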
The following is an example of managing a large number of jobs to prevent overloading the cluster. It is strongly recommended to limit the number of simultaneously running jobs.
#!/bin/bash
cd ~/project/raw_data
for f in $(ls *.fastq);
do rm -f ~/project/slurm_logs/${f}.*.out # remove old log files for this sample, if any
# wait while the number of queued/running jobs exceeds the limit
while [ `squeue -h -u <username> | wc -l` -gt "500" ];
do echo "sleeping"; sleep 300
done
sbatch --cpus-per-task=4 --mem=8gb --time=5-24 \
-p cluster -o ~/project/slurm_logs/${f}.%j.out << EOF
#!/bin/bash
# necessary operations
EOF
done
exit
Dependencies and Job Chains: e.g. submit a job3 after job1 and job2 have completed successfully
job1=$(sbatch --parsable <script1>)
job2=$(sbatch --parsable <script2>)
sbatch -d afterok:${job1}:${job2} <script3>
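Other dependency types exist as well; for example, afterany submits the follow-up job once the listed jobs have finished regardless of their exit status (the <cleanup_script> name is only a placeholder):
sbatch -d afterany:${job1}:${job2} <cleanup_script>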
srun: Launch Tasks and Interactive Sessions
Start an interactive bash session
srun --pty bash
Attach to a running job and run a command
srun --jobid <job_id> --pty <command>
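For example, to open a shell inside the allocation of a running job (the job ID can be taken from squeue):
srun --jobid <job_id> --pty /bin/bash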
squeue: Checking Queue Status
Show the queue information
squeue
Queue information with options
squeue [options]
Common options:
-u <username>: Show jobs for a specific user.
-p <partition>: Show jobs in a specific partition.
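For example, to show only your own jobs in the cluster partition (combining both options):
squeue -u $USER -p cluster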
scontrol: Controlling Jobs
Show detailed information about a job
scontrol show job <job_id>
Show information about a partition
scontrol show partition <partition_name>
Show information about a node
scontrol show node <node_name>
Hold a job
scontrol hold <job_id>
Release a job
scontrol release <job_id>
Change the partition preferences of a pending job
scontrol update job <job id> partition=<partition1>,<partition2>,<partition3>
Change the partition and request specific nodes
scontrol update job <job id> partition=<partition1>,<partition2>,<partition3> nodelist=<node list>
scancel: Cancelling Jobs
Cancel a job
scancel <job_id>
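scancel can also act on all of your jobs at once; for example (use with care):
scancel -u $USER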
It is important to assign the right amount of resources (e.g. number of CPUs: --cpus-per-task, memory: --mem) to a job.
Over-assignment of resources occupies memory or computational power unnecessarily and may prevent new jobs from starting or lead to longer waiting times for other users.
To get an idea of the resource usage of a job, run the htop command inside a running job:
srun --jobid <job_id> --pty htop --user=$USER
From the output, the CPU% and MEM% of the running processes will give you an overall idea of the resource usage, which can be helpful for better assignment of resources.
Also, you can get simple but useful job stats using the reportseff command:
reportseff <job_id>
To get the stats of all recent jobs run by your user:
reportseff -u $USER
Singularity (now known as Apptainer) enables container images for HPC. In a nutshell, Singularity allows end-users to efficiently and safely run a Docker image on an HPC system.
An introduction to docker and how to generate your own images can be found here. More information on how to build docker images and best practices for writing Dockerfiles can be found here and here, respectively.
Singularity/Apptainer is directly available on our HPC (no need to load any module).
Pull a Docker image and convert it to a Singularity image. The resulting Singularity image will contain the software and environment defined by the Docker image.
singularity pull bioinformatics_software.v4.0.2.sif docker://index.docker.io/mpgagebioinformatics/bioinformatics_software:v4.0.2
Launch an interactive Bash shell within the context of the specified Singularity image
singularity exec bioinformatics_software.v4.0.2.sif /bin/bash
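If your data is not visible inside the container, directories can be bind-mounted with -B; whether this is needed depends on the container configuration, and the path below is only an example taken from the Data section of this page:
singularity exec -B /nexus/posix0/MAGE-flaski/service/hpc bioinformatics_software.v4.0.2.sif /bin/bash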
Detailed documentation for Apptainer is available here.
Running a script with Singularity using the mpgagebioinformatics/bioinformatics_software image:
- example script: test.sh
#!/bin/bash
source ~/.bashrc
module load rlang python
which python
python << EOF
print "This is python"
EOF
which R
Rscript -e "print('This is R')"
- running the script
chmod +x test.sh
singularity exec bioinf.sif ./test.sh
/modules/software/python/2.7.15/bin/python
This is python
/modules/software/rlang/3.5.1/bin/R
[1] "This is R"
Running a script over Singularity on SLURM:
- example script: test.slurm.sh
#!/bin/bash
#SBATCH --cpus-per-task=18
#SBATCH --mem=15gb
#SBATCH --time=5-24
#SBATCH -p cluster
#SBATCH -o test.singularity.out
singularity exec bioinf.sif /bin/bash << SHI
#!/bin/bash
source ~/.bashrc
module load rlang python
which python
python << EOF
print "This is python"
EOF
which R
Rscript -e "print('This is R')"
SHI
- running the script over slurm
chmod +x test.slurm.sh
sbatch test.slurm.sh
Submitted batch job 4684802
- check the output
cat test.singularity.out
> /modules/software/python/2.7.15/bin/python
> This is python
>
> /modules/software/rlang/3.5.1/bin/R
> [1] "This is R"
Example for automation over a batch of jobs:
- example script: automation.slurm.singularity.sh
#!/bin/bash
cd ~/project/raw_data
for f in $(ls *.fastq);
do rm -f ~/project/slurm_logs/${f}.*.out
sbatch --cpus-per-task=18 --mem=15gb --time=5-24 \
-p dedicated -o ~/project/slurm_logs/${f}.%j.out <<EOF
#!/bin/bash
singularity exec bioinf.sif /bin/bash << SHI
#!/bin/bash
source ~/.bashrc
module load bwa
cd ~/project/raw_data
bwa mem -t 18 <reference> ${f}
SHI
EOF
done
exit
- running the script
chmod +x automation.slurm.singularity.sh
./automation.slurm.singularity.sh
Our mpgagebioinformatics/bioinformatics_software image makes use of the modules system to load and unload required software.
The Environment Modules Project is a centralized software system. The modules system loads software (the version of your choice) and changes environment variables (e.g. LD_LIBRARY_PATH).
# show available modules
module avail
# show a description of the samtools module
module whatis samtools
# show environment changes for samtools
module show samtools
# load samtools
module load samtools
# list all loaded modules
module list
# unload the samtools module
module unload samtools
# unload all loaded modules
module purge
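Most tools are installed in several versions; you can list them and load a specific one explicitly (the exact version strings depend on the image, so the value below is a placeholder):
# list available versions of samtools
module avail samtools
# load a specific version
module load samtools/<version>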
Data can be accessed from our hpc.bioinformatics.studio as well as from raven.mpcdf.mpg.de:
cd /nexus/posix0/MAGE-flaski/service/hpc/
There you will find a home/<username> folder to store your data and a group/<group name> folder to share data within your group.
From raven.mpcdf.mpg.de you will be able to transfer your data to /raven/ptmp (also ephemeral), e.g.:
rsync -rtvh /nexus/posix0/MAGE-flaski/service/hpc/group/<group>/<folder> /raven/ptmp/<user>/
or to the archive file system:
rsync -rtvh /nexus/posix0/MAGE-flaski/service/hpc/group/<group>/<folder> /r/<username first letter>/<user>/
For more information on /raven/ptmp and the archive /r please consult MPCDF's documentation.
Data stored on the cluster is not backed up. Files will be deleted permanently after being unused for 3 months. You are responsible for backing up your data to a different file system.
Your client needs to be able to use scp (secure copy), which means your files are encrypted during the copy process. We would recommend using FileZilla. This GUI is available for most known operating systems, like Windows, Linux or macOS. Alternatively you can use rsync or modern file transfer tools that make use of the same file transfer protocols.
# copy a file from your machine to the server
scp </path/to/file> <username>@<address>:~/path/to/file
# copy a file from the server to your machine
scp <username>@<address>:</path/to/file> ~/path/to/file
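As a sketch, the same uploads and downloads can be done with rsync over ssh, which can resume interrupted transfers (paths and address are placeholders as above):
# upload a folder to the server
rsync -rtvh </path/to/folder> <username>@<address>:~/path/to/
# download a folder from the server
rsync -rtvh <username>@<address>:</path/to/folder> ~/path/to/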
Posit is a data science developer platform that provides access to multiple development environments including RStudio, Jupyter Notebook, JupyterLab and VS Code. With Posit Bioinformatics Studio, you can run multiple concurrent sessions as well as use different versions of R and Python.
It is available at https://docker.bioinformatics.studio/posit and can be accessed through the institute networks (e.g. internal wifi, lan, vpn, MPCDF gateways). In order to access the service, you need to have a user account at the Max Planck Computing and Data Facility - MPCDF Registration. If you already have an MPCDF account and would like to use Posit Bioinformatics Studio, please mail us at bioinformatics@age.mpg.de.
Login to the platform is possible with your MPCDF user credentials.
Your home directory is /nexus/posix0/MAGE-flaski/service/posit/<user>. However, when starting a new VS Code or Jupyter session terminal, you may find that the current location is the primary home /home/<user>. A simple cd command will take you to the home directory in the nexus space.
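For example, from a fresh session terminal (the printed path is what this setup is expected to show):
cd
pwd
# should print /nexus/posix0/MAGE-flaski/service/posit/<user>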
Additionally, the hpc.bioinformatics.studio home or group directories can also be accessed from the platform. To learn more about managing data, please look into the Data section.
From the command line (e.g. JupyterLab terminal or VS Code terminal):
$ source /python.rc
Available python versions:
3.10.8 3.11.0 3.8.10 3.9.5 jupyter
use 'source /python.rc <version>' to load the respective environment.
$ source /python.rc 3.10.8
$ pip3 install pandas --user
$ pip3 show pandas
Name: pandas
Version: 1.5.1
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: pandas-dev@python.org
License: BSD-3-Clause
Location: /nexus/posix0/MAGE-flaski/service/posit/home/jboucas/.jupyter/python/3.10/lib/python3.10/site-packages
Requires: numpy, python-dateutil, pytz
Required-by: AGEpy, formulaic, lifelines, seaborn, statsmodels
Users should run update.packages() upon first R login.
To change the R version and set the appropriate environment variables from the terminal, /r.rc can be used.
Check the available R versions
source /r.rc
Set R version
source /r.rc <version>
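To confirm which R installation the shell now points at (a quick sanity check):
which R
R --version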
For further remote visualization services please consult MPCDF's documentation and previous training slides.
In order to run visual tools such as posit/rstudio or jupyterlab on hpc.bioinformatics.studio (possibly with custom parameters/resources) via a SLURM job and access them from your local browser, please look into visual_tools.
Also, it is possible to run RStudio on your local device using your preferred rocker RStudio image.
For any query and support, please do not hesitate to get in touch through: bioinformatics@age.mpg.de