Updated 2023-07-28

Run Parallel on the Cluster

Overview

  • GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.
  • This guide will cover how to run Parallel on the Cluster.

Walkthrough: Run Parallel on the Cluster

  • This walkthrough covers how to use seq to generate four lines of input, which are piped into parallel.
  • The SBATCH script can be found here
  • You can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful.
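As a quick preview, seq 1 4 simply prints the numbers 1 through 4, one per line; these lines become the inputs that parallel distributes across jobs:

```shell
# seq generates a sequence of integers, one per line
seq 1 4
# prints:
# 1
# 2
# 3
# 4
```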

Part 1: The SBATCH Script

#!/bin/bash
#SBATCH -JparallelTest
#SBATCH -A [Account]
#SBATCH -N1 --ntasks-per-node=1
#SBATCH -t5
#SBATCH -qinferno
#SBATCH -oReport-%j.out

cd $SLURM_SUBMIT_DIR
module load parallel/20210922

seq 1 4 | parallel echo "Hello world {}!"
  • The #SBATCH directives are standard, requesting 5 minutes of walltime and 1 node with 1 task. More on #SBATCH directives can be found in the Using Slurm on Phoenix Guide
  • $SLURM_SUBMIT_DIR is a variable that represents the directory from which you submitted the SBATCH script. Make sure the files you want to use are in the same directory as the SBATCH script.
  • Output files will show up in this directory as well
  • module load parallel loads the default 20210922 version of Parallel. To see what Parallel versions are available, run module avail parallel, and load the one you want.
  • seq 1 4 | parallel echo "Hello world {}!" runs echo "Hello world {}!" once for each input line, with {} replaced by that line's value.
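Conceptually, the {} placeholder is replaced by each input line, so the pipeline behaves like the sequential loop below (a sketch of the substitution only; parallel additionally runs the jobs concurrently, by default one per CPU core):

```shell
# Sequential sketch of: seq 1 4 | parallel echo "Hello world {}!"
# parallel substitutes each input line for {}; here a loop does it one at a time.
for i in $(seq 1 4); do
  echo "Hello world ${i}!"
done
```

Because each job here is independent, the parallel version produces the same four lines, though not necessarily in order.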

Part 2: Submit Job and Check Status

  • Make sure you're in the directory that contains the SBATCH script.
  • Submit as normal with sbatch <script name>; in this case, sbatch parallel.sbatch
  • Check job status with squeue --job <jobID>, replacing <jobID> with the job ID returned by sbatch
  • You can delete the job with scancel <jobID>, replacing <jobID> with the job ID returned by sbatch
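The steps above look like the following session (a sketch; the job ID 885460 is the illustrative value from the sample output in Part 3, and yours will differ):

```shell
# Submit the job; sbatch prints the new job ID on success,
# e.g. "Submitted batch job 885460"
sbatch parallel.sbatch

# Check the job's status while it is queued or running
squeue --job 885460

# Cancel the job if needed
scancel 885460
```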

Part 3: Collecting Results

  • In the directory where you submitted the SBATCH script, you should see a Report-<jobID>.out file which contains the results of the job.
  • Report-<jobID>.out should look like this:
---------------------------------------
Begin Slurm Prolog: Mar-05-2023 19:39:57
Job ID:    885460
User ID:   svangala3
Account:   phx-pace-staff
Job name:  parallelTest
Partition: cpu-small
QOS:       inferno
---------------------------------------
Hello world 1!
Hello world 2!
Hello world 3!
Hello world 4!
---------------------------------------
Begin Slurm Epilog: Mar-05-2023 19:39:58
Job ID:        885460
Array Job ID:  _4294967294
User ID:       svangala3
Account:       phx-pace-staff
Job name:      parallelTest
Resources:     cpu=1,mem=1G,node=1
Rsrc Used:     cput=00:00:03,vmem=28K,walltime=00:00:03,mem=0,energy_used=0
Partition:     cpu-small
QOS:           inferno
Nodes:         atl1-1-01-004-19-1
---------------------------------------
  • After the result files are produced, you can move them off the cluster; refer to the file transfer guide for help.
  • Congratulations! You successfully ran Parallel on the cluster.