This is a step-by-step guide to setting up a PyTorch workspace on the GMU cloud cluster

phananh1010/gmu-hopper-python-installation

gmu-hopper-docker

This is a step-by-step guide to setting up a PyTorch workspace on the GMU cloud cluster. Note: in this setup, you do NOT need root permission to install almost any Python-based program. If you want to use Docker, refer to step 5 in the file DOCKER_TUTORIAL.md.

Step 0: request and log in to a worker

You must install conda from a worker node, not from the head node. The head node is the machine you are connected to right after logging in. Use the following command to request a worker.

salloc -p gpuq -q gpu --gres=gpu:A100.40gb:1 -n 1  --mem=15G -t 0-12:00:00

The command above assumes you are a normal user. To request resources as a contributor, use this command instead:

salloc --partition=contrib-gpuq --qos=ksun --gres=gpu:A100.80gb:1 --nodes=1 --ntasks-per-node=12 --mem=48G  -t 2-12:00:00
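Before installing anything, it can help to confirm that the shell really is inside an allocation. The sketch below assumes that Slurm exports `SLURM_JOB_ID` in the shell started by `salloc`; the helper name `in_slurm_allocation` is made up for illustration. A set variable only shows that a job exists, so the printed host name should still be compared against the head node's name.

```shell
# Hypothetical helper: succeeds if this shell appears to be inside a Slurm job.
# Assumption: salloc exports SLURM_JOB_ID into the shell it starts.
in_slurm_allocation() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_allocation; then
    echo "Inside Slurm job ${SLURM_JOB_ID} on $(uname -n)"
else
    echo "No Slurm job detected; you may still be on the head node"
fi
```

If the second message appears, request a worker with one of the `salloc` commands above before continuing.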

Step 1: download and install anaconda

Download the sh file:

wget https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh

Install Anaconda in your scratch directory, located at /scratch/[uid]. Note that you must install everything on scratch while logged in to the requested compute machine. Do not install Anaconda from the head node: the head node may run a different OS, and software built for the head node's OS may not work on the requested machine. To install Anaconda, create a directory on scratch, e.g. mkdir /scratch/[uid]/anaconda3.
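A minimal sketch of the install command, assuming your cluster user ID equals `$USER` and that the installer sits in the current directory. The variable names are illustrative. The flags come from the Anaconda installer's own help: `-b` runs without prompts (and accepts the license), `-p` sets the install prefix, and `-u` lets the installer write into a directory that already exists, which matters if you created it with mkdir first. The block only prints the command so you can review it before running it.

```shell
# Assumption: the [uid] in /scratch/[uid] is the same as $USER; set it by hand otherwise.
uid="${USER:-$(id -un)}"
install_dir="/scratch/${uid}/anaconda3"
installer="Anaconda3-2022.05-Linux-x86_64.sh"

# -b: batch mode, -u: allow an existing target directory, -p: install prefix
install_cmd="bash ${installer} -b -u -p ${install_dir}"

# Print the command for review; run it yourself from the worker node.
echo "${install_cmd}"
```

Dropping `-b` should make the installer ask for confirmation interactively instead.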

Step 2: create virtual environment on scratch

Use --prefix to place the environment on scratch: conda create --prefix=/scratch/[uid]/env_name

Step 3: register the conda command in your shell

source /scratch/[uid]/anaconda3/etc/profile.d/conda.sh

You must run this command in each new shell so that the conda command is registered.
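To check that the step worked, the sketch below sources `conda.sh` if it exists and then tests whether `conda` is visible to the shell. It assumes Anaconda was installed at /scratch/[uid]/anaconda3 and that [uid] equals `$USER`; `conda_available` is a hypothetical helper name.

```shell
# Assumption: Anaconda lives at /scratch/<uid>/anaconda3 and <uid> equals $USER.
conda_sh="/scratch/${USER:-$(id -un)}/anaconda3/etc/profile.d/conda.sh"

# Succeeds if a conda command (or shell function) is visible to this shell.
conda_available() {
    command -v conda >/dev/null 2>&1
}

# Source the script only when it is actually there.
if [ -f "$conda_sh" ]; then
    . "$conda_sh"
fi

if conda_available; then
    echo "conda is registered in this shell"
else
    echo "conda not found; check that $conda_sh exists"
fi
```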

Step 4: install your packages

Activate your environment, then install your packages as usual. The path below is an example; use the prefix you chose in step 2.

conda activate /scratch/anguy59/env_hmd_attack

Make sure you are using the environment installed on scratch. Start ipython and run this code:

import sys
sys.exec_prefix  # should be your environment path under /scratch
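The same check can be done from the shell without opening ipython. This is a sketch under the assumption that `python` on your PATH is the interpreter of the activated environment; `is_under_scratch` is a hypothetical helper.

```shell
# Hypothetical helper: succeeds if the given path lies under /scratch/.
is_under_scratch() {
    case "$1" in
        /scratch/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the active interpreter for its prefix; stays empty if python is not on PATH.
prefix="$(python -c 'import sys; print(sys.exec_prefix)' 2>/dev/null || true)"

if is_under_scratch "$prefix"; then
    echo "OK: python runs from $prefix"
else
    echo "Warning: python prefix is '${prefix}', not under /scratch"
fi
```

A warning here usually means the environment was not activated, or that conda.sh was not sourced in this shell.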

Step 5: profit

Fixing an Anaconda error caused by a corrupted yaml package: "AttributeError: module 'ruamel_yaml' has no attribute 'representer'"

Step 1: install pip

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
./python3 get-pip.py --force-reinstall

Step 2: force re-install yaml

./pip3 install --upgrade ruamel.yaml --ignore-installed ruamel.yaml

Step 3: If it still doesn't work, remove Anaconda using rm -rf and reinstall conda from the beginning
