Updated 2023-07-28
Run Parallel on the Cluster¶
Overview¶
- GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.
- This guide will cover how to run Parallel on the Cluster.
Walkthrough: Run Parallel on the Cluster¶
- This walkthrough will cover how to use seq to generate four lines of input, which are piped into parallel.
- The SBATCH script can be found here. You can transfer the file to your account on the cluster to follow along; the file transfer guide may be helpful.
Part 1: The SBATCH Script¶
```bash
#!/bin/bash
#SBATCH -JparallelTest
#SBATCH -A [Account]
#SBATCH -N1 --ntasks-per-node=1
#SBATCH -t5
#SBATCH -qinferno
#SBATCH -oReport-%j.out

cd $SLURM_SUBMIT_DIR
module load parallel/20210922
seq 1 4 | parallel echo "Hello world {}!"
```
- The `#SBATCH` directives are standard, requesting 5 minutes of walltime (`-t5`) and 1 node with 1 core. More on `#SBATCH` directives can be found in the Using Slurm on Phoenix Guide.
- `$SLURM_SUBMIT_DIR` is a variable that represents the directory you submit the SBATCH script from. Make sure the files you want to use are in the same directory as the SBATCH script. Output files will also show up in this directory.
- `module load parallel/20210922` loads the default 20210922 version of Parallel. To see which Parallel versions are available, run `module avail parallel` and load the one you want.
- `seq 1 4 | parallel echo "Hello world {}!"` generates the numbers 1 through 4 and runs the "Hello world" command in parallel once per input line, with `{}` replaced by that line.
Part 2: Submit Job and Check Status¶
- Make sure you're in the directory that contains the SBATCH script.
- Submit as normal with `sbatch <script name>`; in this case, `sbatch parallel.sbatch`.
- Check job status with `squeue --job <jobID>`, replacing `<jobID>` with the job ID returned after running sbatch.
- You can delete the job with `scancel <jobID>`, replacing `<jobID>` with the job ID returned after running sbatch.
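Since sbatch prints a line of the form `Submitted batch job <jobID>`, you can capture the ID in a shell variable instead of copying it by hand. A minimal sketch (885460 is a made-up example ID; on the cluster you would set `out=$(sbatch parallel.sbatch)`):

```shell
# Capture the job ID from sbatch's output so squeue/scancel can reuse it.
# Here we use the literal line sbatch prints, with an example ID.
out="Submitted batch job 885460"
jobid=$(echo "$out" | awk '{print $4}')  # the ID is the 4th field
echo "$jobid"                            # prints 885460
# squeue --job "$jobid"                  # check status
# scancel "$jobid"                       # cancel if needed
```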
Part 3: Collecting Results¶
- In the directory where you submitted the SBATCH script, you should see a `Report-<jobID>.out` file which contains the results of the job. `Report-<jobID>.out` should look like this:
```
---------------------------------------
Begin Slurm Prolog: Mar-05-2023 19:39:57
Job ID: 885460
User ID: svangala3
Account: phx-pace-staff
Job name: parallelTest
Partition: cpu-small
QOS: inferno
---------------------------------------
Hello world 1!
Hello world 2!
Hello world 3!
Hello world 4!
---------------------------------------
Begin Slurm Epilog: Mar-05-2023 19:39:58
Job ID: 885460
Array Job ID: _4294967294
User ID: svangala3
Account: phx-pace-staff
Job name: parallelTest
Resources: cpu=1,mem=1G,node=1
Rsrc Used: cput=00:00:03,vmem=28K,walltime=00:00:03,mem=0,energy_used=0
Partition: cpu-small
QOS: inferno
Nodes: atl1-1-01-004-19-1
---------------------------------------
```
- After the result files are produced, you can move them off the cluster; refer to the file transfer guide for help.
- Congratulations! You successfully ran Parallel on the cluster.