Sherlock cluster at Stanford University
The basics of scheduling jobs on the Sherlock cluster (these instructions probably also work for other clusters, like FarmShare)
We are not currently paying for priority access to Sherlock, so we will get booted from nodes if paying members opt to use them. For this reason, please try to make your jobs persistent: they should continuously write their output to disk, and should be able to restart if stopped. So if you have a giant 'for' loop of image processing, write each processed image to disk as soon as you're done with it.
If you need to keep track of the values of a bunch of variables, try writing the entire execution state of your program to disk every time the loop body runs (as long as this is relatively inexpensive compared to the time it takes to produce the new state in each step of the loop). So if you're doing a massive serial numerical integration, at the end of each timestep consider writing the timestep number and the current values of all of your independent and dependent variables to disk. You have a 15 GB quota (40 TB if you remember to change directories to $SCRATCH), so write everything that you can to disk as often as you can without affecting performance.
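As a concrete sketch of the image-processing case (the process_image command and directory names are placeholders, not real tools on Sherlock):
#!/bin/bash
# process images one at a time, writing each result to disk immediately
# so that progress survives being preempted or killed
mkdir -p "$SCRATCH/processed"
for img in $SCRATCH/raw_images/*.png; do
    out="$SCRATCH/processed/$(basename "$img")"
    # skip images already handled by a previous (interrupted) run
    [ -f "$out" ] && continue
    ./process_image "$img" "$out"
done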
Getting started on Sherlock
Authenticate your computer using Kerberos. You usually don’t need to also use the VPN, even if you are off campus.
$ kinit username@stanford.edu
$ ssh username@sherlock.stanford.edu
To return to your home directory
$ cd $HOME
Check your disk usage (for home directory, 15 GB max)
$ df -h $HOME
If needed, switch to the file system for large amounts of data (40 TB per user)
$ cd $SCRATCH
Check your SCRATCH usage
$ lfs quota -u $(id -un) /scratch
Useful shortcuts
Copy a file from local over to $HOME
$ scp cfgen.py wgilpin@sherlock.stanford.edu:~/
Copy a file over to local
$ scp your_username@remotehost.edu:my_output_files/*results.txt /some/local/directory
Example: A MATLAB job
Write as much of your code as you can in a single MATLAB script “matlabtest.m” that can be run directly without further user input.
Now write a wrapper script that calls the MATLAB file. We’ll use the terminal-based text editor emacs
$ emacs my_batch_script.sh
You’re now in the emacs editor window; here’s an example script:
#!/bin/bash
matlab -nodisplay -nosplash < matlabtest.m
If you need to change directories, set paths, etc., this wrapper script is the place to use cd, ls, pwd, and so on (see the sketch below).
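For instance, a slightly fuller wrapper might look like this (the project directory name is just a placeholder):
#!/bin/bash
# move to the directory that holds the .m file before launching MATLAB
cd $HOME/matlab_project
matlab -nodisplay -nosplash < matlabtest.m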
Now use emacs to write a batch script “tinytest.sbatch” that requests resources and runs your wrapper script
#!/bin/bash
#
#all lines that start with #SBATCH are directives used only by SLURM for scheduling
#################
#set a job name
#SBATCH --job-name=tinytest
#################
#a file for job output, you can check job progress
#SBATCH --output=tinytest.out
#################
# a file for errors from the job
#SBATCH --error=tinytest.err
#################
#wall-clock time you think you need; the default is one hour
#here 1:00 means one minute (minutes:seconds format)
#SBATCH --time=1:00
#################
#quality of service; think of it as job priority
#SBATCH --qos=normal
#################
#number of nodes you are requesting
#SBATCH --nodes=1
#################
#memory per node in MB; the default is 4000 MB per CPU
#SBATCH --mem=1000
#you could instead use --mem-per-cpu; SLURM's "CPUs" are what we usually call cores
#################
#tasks to run per node; a "task" is usually mapped to an MPI process
# for local parallelism (OpenMP or threads), use "--ntasks-per-node=1 --cpus-per-task=16" instead
#SBATCH --ntasks-per-node=1
#################
#now run normal batch commands
#the MPI lines below come from a generic template and are not needed for a plain MATLAB job
module load openmpi/1.6.5/intel13sp1up1
#you have to load the matlab module before calling matlab
module load matlab
#run the Intel MPI Benchmarks with srun (again, just a template example)
srun /usr/mpi/intel-13/openmpi-1.6.5-1/tests/IMB-3.2.4/IMB-MPI1
#now call your bash file
bash my_batch_script.sh
If my_batch_script.sh
is particularly simple you can just put the contents of it directly into the .sbatch
file, in order to reduce the number of files and dependencies floating around.
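For example, a minimal self-contained .sbatch for the MATLAB job above might look like this sketch (the same #SBATCH options as before, with the MATLAB call inlined):
#!/bin/bash
#SBATCH --job-name=tinytest
#SBATCH --output=tinytest.out
#SBATCH --error=tinytest.err
#SBATCH --time=1:00
#SBATCH --qos=normal
#SBATCH --nodes=1
#SBATCH --mem=1000
#SBATCH --ntasks-per-node=1
#run MATLAB directly instead of going through a separate wrapper script
module load matlab
matlab -nodisplay -nosplash < matlabtest.m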
Now submit your job
sbatch tinytest.sbatch
Output should be written to tinytest.out, and any errors or warnings from the job should go to tinytest.err
Check on your job
squeue -u <netid>
Notes
- Look at the Sherlock wiki to see how to use the big data nodes for working on large numbers of image files
- Use standard terminal commands like scp, etc to move files back and forth between your computer and the cluster
Python jobs
This works much the same as submitting MATLAB jobs; however, you need to load the correct Python version and enter any virtualenvs inside my_batch_script.sh. The full steps are:
Put your entire Python script into a single file
integrator.py
Load the correct Python version, enter the virtualenv, and install all of the necessary packages
module load python/2.7.5
source venv2/bin/activate
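If the virtualenv does not exist yet, create it and install your packages first (venv2 and the package names here are just placeholders):
module load python/2.7.5
virtualenv venv2
source venv2/bin/activate
pip install numpy scipy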
Now write a wrapper script that enters the virtualenv and then runs the Python script. We’ll use the terminal-based text editor emacs
$ emacs my_batch_script.sh
You’re now in the emacs editor window; here’s an example script:
#!/bin/bash
source venv2/bin/activate
python integrator.py
Now make some sort of sbatch file like the one above that ends with
...
bash my_batch_script.sh
As before, if my_batch_script.sh
is particularly simple you can just put the contents of it directly into the .sbatch
file, in order to reduce the number of files and dependencies floating around.
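If you do inline it, the tail of the .sbatch file could simply read (using the same venv2 and integrator.py as above):
#now run normal batch commands
module load python/2.7.5
source venv2/bin/activate
python integrator.py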
Sandboxing (recommended)
- For Python 3, use the built-in venv module (python3 -m venv; older 3.x versions ship this as the pyvenv script)
- For Python 2.7, use the virtualenv package
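A minimal sketch of each (the environment directory names are arbitrary):
# Python 3: built-in venv module
python3 -m venv myenv3
source myenv3/bin/activate
# Python 2.7: the virtualenv package
virtualenv venv2
source venv2/bin/activate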
Advanced use, shortcuts, and tricks
To test your code or make small modifications, switch to a development node
sdev
This will allocate 1 node for 1 hour.
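For example, a quick interactive test of the MATLAB job above might look like:
$ sdev
# ...wait for the allocation; then, at the compute-node prompt:
$ module load matlab
$ matlab -nodisplay < matlabtest.m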
If large amounts of data will be generated or read by your script, add the following line to my_batch_script.sh
cd $SCRATCH
This will run the job on a 40 TB scratch disk (not backed up). To access the output of your job, switch to this disk and follow the same relative paths as before. Another option is $PI_HOME and $PI_SCRATCH. For data that doesn’t need to exist longer than the life of the job, use the node’s own hard disk: $LOCAL_SCRATCH (only ~80 GB).
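A minimal sketch of such a wrapper for the Python job above (the output directory name and virtualenv location are placeholders):
#!/bin/bash
# run from the large scratch file system (not backed up)
cd $SCRATCH
mkdir -p integrator_output && cd integrator_output
# assuming the virtualenv venv2 lives in your home directory
source $HOME/venv2/bin/activate
python $HOME/integrator.py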
Look up information about a currently running job
scontrol show job <jobid>
For a little more detail,
scontrol show jobid -dd <jobid>
Cancel your job
scancel <jobid>
Cancel all your jobs
scancel -u <netid>
Look at your FairShare usage
sshare -l
Generally you will have trouble scheduling jobs if your score falls below 0.5. It recovers and increases over time.
Useful links
http://www.brightcomputing.com/Blog/bid/174099/Slurm-101-Basic-Slurm-Usage-for-Linux-Clusters