Useful Commands for HPC (NYU Shanghai, Pudong Campus)


The official documentation: https://github.com/michael-qi/nyush_hpc


Creating Virtual Environment


## load anaconda
$ module load anaconda3/5.2.0
## create virtual environment
$ conda create -n myEnv python=3.7 anaconda
## activate
$ source activate myEnv
## install packages
$ conda install -n myEnv [package]

Create the file runScript.sh:

#!/bin/bash

#SBATCH --time=120:00:00
#SBATCH --job-name=JobName
#SBATCH --output=slurm_%j.out

module purge
module load anaconda3/5.2.0
module load python/gnu/3.7.3
source activate myEnv
python myScript.py 

Submit using:

$ sbatch runScript.sh
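
Because runScript.sh loads both anaconda3/5.2.0 and python/gnu/3.7.3, it can be worth confirming which interpreter the job actually runs. The following is a minimal sketch, assuming it is placed near the top of myScript.py. The helper name describe_interpreter is hypothetical; CONDA_DEFAULT_ENV is the variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_interpreter(environ=None):
    """Return a small summary of the Python interpreter running this script."""
    environ = os.environ if environ is None else environ
    return {
        "executable": sys.executable,  # full path of the running python binary
        "prefix": sys.prefix,          # installation or environment root
        "conda_env": environ.get("CONDA_DEFAULT_ENV", "(none)"),
    }


if __name__ == "__main__":
    # When run inside a Slurm job, these lines end up in slurm_%j.out
    for key, value in describe_interpreter().items():
        print("%s: %s" % (key, value))
```

If conda_env is not myEnv, or the executable path does not point inside the environment, the order of the module load lines is probably the first thing to check.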

Deactivate and delete a no longer needed virtual environment:

$ source deactivate
$ conda remove -n myEnv --all

Array Jobs with Python

Running the same code with different parameters as inputs: suppose the parameter space is 3-dimensional and each dimension has 2 possible values. That gives 2 x 2 x 2 = 8 parameter combinations, so we run the script 8 times as an array job, one task per combination.

Create the python file myScript.py:

import sys
import numpy as np
    
Task_ID = int(sys.argv[1])

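# this is equivalent to Matlab's ind2sub (column-major order), but 0-based:
# Task_ID runs from 1 to 8, so subtract 1 before unraveling
ind = np.unravel_index(Task_ID - 1, (2, 2, 2), order='F')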
ind_x = ind[0]
ind_y = ind[1]
ind_z = ind[2]

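# assign parameter values to the different ind_x, ind_y, ind_z values
# (the grids below are placeholders; replace them with your own values)
x = [0.1, 1.0][ind_x]
y = [0.1, 1.0][ind_y]
z = [0.1, 1.0][ind_z]

# do the computation (placeholder; replace with your own code)
result = x + y + z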


# collect output
with open("result.txt", "a+") as text_file:
    text_file.write("x: %s, y: %s, z: %s, result: %s \n" %(x, y, z, result))
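
To see how the task IDs cover the parameter space, the index mapping can be printed for all 8 tasks. This is a small sketch of the same np.unravel_index call used in myScript.py; the helper name task_to_indices is hypothetical.

```python
import numpy as np

SHAPE = (2, 2, 2)  # 3 parameters, 2 possible values each


def task_to_indices(task_id, shape=SHAPE):
    """Map a 1-based task ID to 0-based indices, with the first index varying fastest."""
    ind = np.unravel_index(task_id - 1, shape, order="F")
    return tuple(int(i) for i in ind)


# Print the (ind_x, ind_y, ind_z) triple for every task in --array=1-8
for task_id in range(1, 9):
    print(task_id, task_to_indices(task_id))
```

With order "F", task 1 maps to (0, 0, 0), task 2 to (1, 0, 0), task 3 to (0, 1, 0), and task 8 to (1, 1, 1), so every combination is visited exactly once.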


In the bash file:

#!/bin/bash

#SBATCH --time=120:00:00
#SBATCH --job-name=JobName
#SBATCH --output=slurm_%j.out
#SBATCH --array=1-8

python myScript.py $SLURM_ARRAY_TASK_ID
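
Slurm also exports SLURM_ARRAY_TASK_ID as an environment variable inside each array task, so the script can fall back to it when no argument is passed, and to a default when testing outside Slurm. This is a minimal sketch; the helper name resolve_task_id and the default of 1 are assumptions, not part of the original script.

```python
def resolve_task_id(argv, environ, default=1):
    """Pick the task ID from the command line, then the environment, then a default."""
    if len(argv) > 1:
        return int(argv[1])
    if "SLURM_ARRAY_TASK_ID" in environ:
        return int(environ["SLURM_ARRAY_TASK_ID"])
    return default


# Illustrative calls with made-up inputs
print(resolve_task_id(["myScript.py", "5"], {}))                       # argument wins
print(resolve_task_id(["myScript.py"], {"SLURM_ARRAY_TASK_ID": "3"}))  # from environment
print(resolve_task_id(["myScript.py"], {}))                            # default
```

In myScript.py this would be called as resolve_task_id(sys.argv, os.environ) in place of reading sys.argv[1] directly.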

Using Jupyter Lab

First create a virtual environment (with the required libraries installed). Then create the bash file jupyter.sh, which requests 1 GPU:


#!/bin/bash

#SBATCH --partition=aquila
#SBATCH --nodelist=agpu1
#SBATCH --gres=gpu:1
#SBATCH --job-name jupyter
#SBATCH --output jupyter-log.txt
#SBATCH --time=120:00:00
#SBATCH --mem=80GB

module purge
module load python/gnu/3.7.3
module load anaconda3/5.2.0
source activate myEnv


XDG_RUNTIME_DIR=""
ipnport=$(shuf -i8000-9999 -n1)
ipnip=$(hostname -i)

echo -e "\n"
echo    "  Paste ssh command in a terminal on local host (i.e., laptop)"
echo    "  ------------------------------------------------------------"
echo -e "  ssh -N -L $ipnport:$ipnip:$ipnport $USER@hpc.shanghai.nyu.edu\n"
echo    "  Open this address in a browser on local host; see token below"
echo    "  ------------------------------------------------------------"
echo -e "  localhost:$ipnport                                      \n\n"

jupyter-lab --no-browser --port=$ipnport --ip=$ipnip

Submit the job, then print the log file:

$ sbatch jupyter.sh
$ cat jupyter-log.txt

Then open another terminal on your local machine (e.g., laptop) and paste the line in jupyter-log.txt that is similar to:

$ ssh -N -L $ipnport:$ipnip:$ipnport $USER@hpc.shanghai.nyu.edu

Finally, open the following address in a browser on your local machine (the access token is printed further down in jupyter-log.txt):

localhost:$ipnport
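
The tunnel command and the browser address can also be pulled out of the log automatically. This is a rough sketch that assumes the log contains the line exactly as echoed by jupyter.sh; the helper name find_tunnel is hypothetical, and the port, IP address, and user name in the sample log are made up.

```python
import re

# Pattern for the tunnel command echoed by jupyter.sh:
#   ssh -N -L <port>:<ip>:<port> <user>@hpc.shanghai.nyu.edu
TUNNEL_RE = re.compile(r"ssh -N -L (\d+):(\S+):(\d+) (\S+)")


def find_tunnel(log_text):
    """Return (ssh_command, local_address) from the log text, or None if absent."""
    for line in log_text.splitlines():
        match = TUNNEL_RE.search(line)
        if match:
            port = match.group(1)
            return match.group(0), "localhost:%s" % port
    return None


# Made-up sample of what jupyter-log.txt might contain
sample_log = """
  Paste ssh command in a terminal on local host (i.e., laptop)
  ------------------------------------------------------------
  ssh -N -L 8123:10.0.0.5:8123 netid@hpc.shanghai.nyu.edu

  Open this address in a browser on local host; see token below
"""
print(find_tunnel(sample_log))
```

On the cluster, the real log could be passed in with find_tunnel(open("jupyter-log.txt").read()). If hostname -i returns more than one address on a node, the echoed line has a different shape and this pattern would need adjusting.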

Installing R packages

In the shell:


$ module load R/gnu/3.6.3
$ R
> install.packages("myPackage")

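and answer "yes" if R asks whether to use (and create) a personal library, since the system library is typically not writable for regular users.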

