Updated 2022-10-06
Convert PBS Scripts to Slurm Scripts¶
Basic Info¶
SLURM is a resource manager with scheduling logic integrated into it. In comparison to Moab/Torque, SLURM eliminates the need for dedicated queues. In addition to allocation of resources at the job level, jobs spawn steps (srun instances), which are further allocated resources from within the job's allocation. The job steps can therefore execute sequentially or concurrently.
Visit our Slurm on Phoenix guide for more information about using Slurm. To learn more about the migration to Slurm, visit our Phoenix Slurm Migration page.
Preparation Steps¶
- Be sure to recompile software you have written or installed, particularly if it uses MPI. The Slurm cluster contains updated libraries.
- Update `module load` commands to the current software offerings on Phoenix.
- Look up your charge account with `pace-quota`, as shown in the sketch below.
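A typical preparation sequence on a login node might look like the following sketch. The module names (`gcc`, `mvapich2`) are placeholders only; run `module avail` to see the current offerings on Phoenix.

```bash
module avail                              # list the software currently offered on Phoenix
module load gcc mvapich2                  # placeholder modules: load a current compiler/MPI stack
mpicc -O2 mpi_program.c -o mpi_program    # recompile MPI code against the updated libraries

pace-quota                                # look up your charge account
```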
How is Slurm usage different from Torque/Moab?¶
- What Moab called queues, Slurm calls partitions. On Phoenix, partitions are assigned automatically based on your account, the resources requested, and the Quality of Service (QOS) option (`inferno` or `embers`). You will not be able to specify the partition.
- Resources are assigned per task/process. One core is given per task by default.
- Environment variables from the submitting process are passed to the job by default. Use `--export=NONE` to run jobs with a clean environment. With the default behavior, variables like `$HOSTNAME` will be cloned from the login node when jobs are submitted from it.
- The first line of a job script in Slurm must be `#!<shell>`; see the Conversion Examples section below.
- Slurm jobs start in the submission directory rather than `$HOME`.
- Slurm jobs combine stdout and stderr into a single output log file by default. To write stderr to a separate file, provide the `--error` or `-e` option. In Moab, stdout and stderr went to separate files by default and were merged with the `-j oe` option.
- Slurm can send email when your job reaches a certain percentage of its walltime limit, e.g. `sbatch --mail-type=TIME_LIMIT_90 myjob.txt`.
- The default memory request on Slurm is 1 GB/core. To request all the memory on a node, include `--mem=0`.
- Requesting a number of nodes or cores is structured differently. To request an exact number of nodes, use `-N`. To request an exact number of cores per node, use `--ntasks-per-node`. To request a total number of cores, use `-n` or `--ntasks`.
- The commands used to submit and manage jobs on the cluster are different for SLURM than they were for Moab. To submit jobs, you will now use the `sbatch` and `srun` commands. To check job status, you will most commonly use the `squeue` command.
- Arrays are given a `SLURM_ARRAY_JOB_ID` for the parent job, and each child job gets its own `SLURM_JOB_ID`. Moab assigned the same `PBS_JOBID` to each job with a different index. For more options and guidelines on how to use arrays in SLURM, please visit the Array Jobs section on the following page.
- To include job information (such as the job name or ID) when naming output files in SLURM, use filename patterns as follows: job name `%x`, job ID `%j`, job array index `%a`, username `%u`, hostname `%N`.
- `srun` is the standard SLURM command to start an MPI program. It automatically uses the allocated job resources: node list, tasks, and logical cores per task. Do not use `mpirun`.
Warning

Do not use `mpirun` or `mpiexec` with Slurm. Use `srun` instead.
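Taken together, the points above produce a batch script like the minimal sketch below. The job name, account (`gts-gburdell3`), and resource amounts are placeholders to adapt to your own work.

```bash
#!/bin/bash
#SBATCH -J example                  # job name (%x in output-file patterns)
#SBATCH -A gts-gburdell3            # charge account reported by pace-quota
#SBATCH -q inferno                  # QOS; the partition is assigned automatically
#SBATCH -N 1 --ntasks-per-node=4    # exactly 1 node with 4 tasks (1 core per task by default)
#SBATCH --mem-per-cpu=2G            # default is 1 GB/core; --mem=0 requests a whole node's memory
#SBATCH -t 15                       # walltime in minutes
#SBATCH -o %x-%j.out                # stdout and stderr are combined here unless -e is given
#SBATCH --mail-type=TIME_LIMIT_90   # email at 90% of the walltime limit

cd $SLURM_SUBMIT_DIR                # optional: the job already starts in the submission directory
srun ./hello_world.out              # srun launches the program on the allocated resources
```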
Cheat Sheet¶
This table lists the most common commands, environment variables, and job specification options used by the major workload management systems. Users can refer to this cheat sheet when converting their PBS scripts and commands to their SLURM equivalents. A full list of SLURM commands can be found here. Further guidelines on more advanced scripts are in the user documentation on this page.
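As a quick illustration of the most common command equivalences, job submission, status checks, and job cancellation map as follows (the username and the job ID `12345` are placeholders):

```bash
# Submit a batch job
qsub myjobscript.pbs          # PBS/Torque
sbatch myjobscript.sbatch     # SLURM

# Check the status of queued and running jobs
qstat -u gburdell3            # PBS/Torque
squeue -u gburdell3           # SLURM

# Cancel a job by ID (12345 is a placeholder)
qdel 12345                    # PBS/Torque
scancel 12345                 # SLURM
```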
Conversion Examples¶
The following table shows how common PBS script directives and commands can be rewritten for a SLURM script.
| Specification | PBS | SLURM |
|---|---|---|
| Shell | `#!/bin/bash` (optional) | `#!/bin/bash` |
| Job Name | `#PBS -N jobname` | `#SBATCH -J jobname` |
| Account Name | `#PBS -A accountname` | `#SBATCH -A accountname` (required on Phoenix-Slurm) |
| Job Resources | `#PBS -l nodes=2:ppn=4`<br>`#PBS -l nodes=100` (any 100 cores)<br>`#PBS -l pmem=2gb`<br>`#PBS -l walltime=10:00`<br>`#PBS -l nodes=2:ppn=4:gpus=1,pmem=3gb` (8 cores, 2 GPUs, and 24 GB memory across 1 or 2 nodes) | `#SBATCH -N 2 --ntasks-per-node=4`<br>`#SBATCH -n 100` (any 100 cores)<br>`#SBATCH --mem-per-cpu=2G`<br>`#SBATCH -t 10`<br>`#SBATCH -N 2 --gres=gpu:1 --gres-flags=enforce-binding --mem-per-gpu=12G` (exactly 2 nodes, each with 6 cores, 1 GPU, and 12 GB memory) |
| Queue Name | `#PBS -q inferno` | `#SBATCH -q inferno` (`-q` stands for "quality of service", not "queue", in Slurm) |
| Output/Error Reports | `#PBS -j oe`<br>`#PBS -o Report-$PBS_JOBID.out` | `#SBATCH -o Report-%j.out` |
| Email Notification | `#PBS -m abe`<br>`#PBS -M gburdell3@gatech.edu` | `#SBATCH --mail-type=BEGIN,END,FAIL`<br>`#SBATCH --mail-user=gburdell3@gatech.edu` |
| Work Directory | `cd $PBS_O_WORKDIR` | `cd $SLURM_SUBMIT_DIR` (default/optional) |
| Run Process | `hello_world.out` | `srun hello_world.out` |
| MPI Process | `mpicc -O2 mpi_program.c -o mpi_program`<br>`mpiexec -n 4 mpi_program program_arguments` | `mpicc -O2 mpi_program.c -o mpi_program`<br>`srun mpi_program program_arguments` |
| Array Job | `#PBS -t 1-10`<br>`python myscript.py dataset${PBS_ARRAYID}` | `#SBATCH --array=1-10`<br>`#SBATCH -o %A_%a.out`<br>`python myscript.py dataset${SLURM_ARRAY_TASK_ID}` |
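Assembled from the rows above, a converted SLURM script might look like the sketch below. The job name, account, email address, and resource amounts are the placeholder values from the table and should be replaced with your own.

```bash
#!/bin/bash
#SBATCH -J jobname
#SBATCH -A accountname                      # required on Phoenix-Slurm
#SBATCH -N 2 --ntasks-per-node=4            # 2 nodes, 4 tasks per node
#SBATCH --mem-per-cpu=2G
#SBATCH -t 10                               # walltime in minutes
#SBATCH -q inferno
#SBATCH -o Report-%j.out                    # stdout and stderr combined by default
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=gburdell3@gatech.edu

cd $SLURM_SUBMIT_DIR                        # default/optional

mpicc -O2 mpi_program.c -o mpi_program
srun mpi_program program_arguments          # use srun rather than mpiexec/mpirun
```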
Job Submission Examples¶
| Job type | Moab/Torque | SLURM |
|---|---|---|
| Script Submission | `qsub myjobscript.pbs` | `sbatch myjobscript.sbatch` |
| Command-line Submission | `qsub -A GT-gburdell3 -l walltime=02:00:00 -l nodes=2:ppn=4 -l pmem=1gb -q inferno -o job1.out -e job1.err myscript.py -f myfile.txt` | `sbatch -A gts-gburdell3 -t 2:00:00 -N 2 --ntasks-per-node=4 --mem-per-cpu=1G -q inferno -o job1.out -e job1.err myscript.py -f myfile.txt` |
| Interactive Session | `qsub -A GT-gburdell3 -l nodes=1:ppn=4 -l walltime=02:00:00 -l pmem=128gb -q inferno -I`<br>`mpiexec -n 4 python my_mpi_script.py`<br>`exit` | `salloc -q inferno -A gts-gburdell3 -N 1 --ntasks-per-node=4 --time=02:00:00 --mem=128G`<br>`srun -n 2 python my_mpi_script.py &`<br>`srun -n 2 <other_commands> &`<br>(multiple `srun` steps can execute in parallel using `&`)<br>`exit` |
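For example, the interactive workflow in the last row could be run from the command line as in the sketch below; `./other_program` is a hypothetical placeholder for any additional command, and the trailing `&` lets the two `srun` job steps execute concurrently within the allocation.

```bash
# Request an interactive allocation: 1 node, 4 tasks, 2 hours, 128 GB of memory
salloc -q inferno -A gts-gburdell3 -N 1 --ntasks-per-node=4 --time=02:00:00 --mem=128G

# Inside the allocation, launch job steps with srun; '&' runs them in parallel
srun -n 2 python my_mpi_script.py &
srun -n 2 ./other_program &          # placeholder for any other command
wait                                 # wait for both background job steps to finish

exit                                 # release the allocation
```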