Updated 2022-08-23
pylauncher (Slurm)¶
Note
This page describes how to use pylauncher on clusters with Slurm. For information about using pylauncher on clusters with the Torque scheduler (using qsub), see here.
pylauncher is a Python-based parametric job launcher that allows for the execution of many small jobs in parallel. It is a utility for performing HTC-style workflows on HPC systems such as PACE.
Let's say that you need to run a large number of serial jobs, such as 5000. Your cluster may only allow you to allocate a certain number of cores at any given time, such as 100 cores. In addition, policy may limit you to a certain number of queued jobs at any given time, say 2000 jobs. Under these conditions, it is not possible to simply enqueue all of the jobs at once. However, a tool such as pylauncher allows you to submit one parallel job that runs all 5000 calculations while taking full advantage of the cores you are able to allocate. While there is a trade-off in that a 100-core job may take longer to start, this approach makes the best use of the system resources without impacting other users.
In the simplest case, pylauncher takes a file containing a set of command lines and hands them out to the cores in a cyclic manner. This alone is not an optimal solution, since in most cases the individual command lines may take widely varying amounts of time. pylauncher therefore has a dynamic manager that keeps track of resources and reuses them as they become available.
In more ambitious use cases, the set of commands may not be known at scheduling time, but can be created programmatically. For instance, after running a coarse search over a parameter space, you could run an analysis and determine where to focus your subsequent searches. This leads to more efficient use of the system, since it allows you to decrease either the computational resources spent (as with an overly fine mesh search) or the time spent waiting for an answer that would otherwise have to be constructed manually and resubmitted.
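As a minimal sketch of this idea, the short script below generates a pylauncher command file around the result of a coarse search. The parameter values and the my_solver executable are hypothetical placeholders, not part of pylauncher itself:
# Hypothetical sketch: write a refined pylauncher command file after a
# coarse parameter search has identified a region of interest.
coarse_best = 0.12  # hypothetical result of the coarse search

with open("refined_jobfile.in", "w") as f:
    # Sample more finely around the best coarse value.
    for i in range(-5, 6):
        alpha = coarse_best + 0.002 * i
        # my_solver is a placeholder for your own executable.
        f.write(f"./my_solver --alpha {alpha:.4f}\n")
The resulting refined_jobfile.in can then be passed to pylauncher.ClassicLauncher like any hand-written command file.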
The pylauncher utility was developed at the Texas Advanced Computing Center by Dr. Victor Eijkhout. The official GitHub repository is at the TACC GitHub. Support for PBS-based schedulers was contributed by Dr. Christopher Blanton. A version with srun support (used here) is available in a fork at Ron Rahaman's GitHub.
Using pylauncher on PACE systems¶
Most common use case¶
pylauncher is installed as a module on the PACE systems. It can be loaded with
$ module load anaconda3
$ module load pylauncher
In some cases, it may be necessary to install a custom environment for other packages that may be used. The key requirement is an environment that has the paramiko Python module installed.
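As a minimal sketch, assuming a conda-based setup (the environment name my-pylauncher-env is a placeholder), such an environment could be created with:
$ module load anaconda3
$ conda create -n my-pylauncher-env python paramiko
$ conda activate my-pylauncher-env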
Example Cases¶
Simple Serial Case using pylauncher¶
Running a serial pylauncher job requires writing a simple Python script. For example:
from pylauncher import pylauncher
pylauncher.ClassicLauncher("your_jobfile.in")
The driver function pylauncher.ClassicLauncher only needs one argument for a constant single-processor job (it will discover the available resources in the background, so they do not need to be specified). The argument is the name of the file which contains the command lines. If the file is not located in the same directory as the script, its path will need to be specified.
The command line file ("testfile_serial.in" in the example below) is simply a collection of the commands to be run, one per task. These lines will inherit the environment of the main process. Lines that start with # are ignored, as are blank lines.
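As an illustrative sketch (the my_program commands are placeholders, not part of pylauncher), a command file could look like:
# comment lines like this one, and the blank line below, are ignored

./my_program --input case01.dat
./my_program --input case02.dat
./my_program --input case03.dat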
There are debug options that can be set by means of an additional keyword debug='job+host+task', if needed, so the function call becomes
pylauncher.ClassicLauncher("your_jobfile.in", debug='job+host+task')
Simple Serial Case pylauncher script (test_serial.py)¶
from pylauncher import pylauncher
pylauncher.ClassicLauncher("testfile_serial.in")
Simple Serial Case sbatch Script (test_serial.sbatch)¶
#!/usr/bin/env bash
#SBATCH --job-name=test_serial
#SBATCH --account=<your account>
#SBATCH --partition=hive
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=6
#SBATCH --time=00:05:00
#SBATCH --output=%x-%j.out
module load anaconda3 pylauncher
python3 test_serial.py
Simple Input File (testfile_serial.in)¶
echo 0 >> /dev/null 2>&1; sleep 21
echo 1 >> /dev/null 2>&1; sleep 30
echo 2 >> /dev/null 2>&1; sleep 8
echo 3 >> /dev/null 2>&1; sleep 34
echo 4 >> /dev/null 2>&1; sleep 39
echo 5 >> /dev/null 2>&1; sleep 9
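Once these three files are in the same directory, the job can be submitted in the usual way:
$ sbatch test_serial.sbatch
pylauncher will then distribute the six command lines over the six allocated tasks, starting new work as the shorter sleep commands finish.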
Non-Distributed Memory Parallel Workflow¶
It is also possible to run parallel tasks using the pylauncher.ClassicLauncher function by adding the cores keyword.
The function call becomes
pylauncher.ClassicLauncher("your_jobfile.in", cores=3)
which will allocate three cores to each command line.
Below, an example code is shown with the scripts needed to run it.
Example Executable Code¶
The code used is a simple hello world for pthreads (named pth_hello.c). The executable takes the number of threads to use on the command line. The difference from the serial example is the inclusion of threading: each command line now needs multiple cores.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

int thread_count;

void* Hello(void* rank);

int main(int argc, char* argv[]) {
    long thread;
    pthread_t* thread_handles;

    /* The number of threads is given as the first command line argument. */
    thread_count = strtol(argv[1], NULL, 10);
    thread_handles = malloc(thread_count * sizeof(pthread_t));

    for (thread = 0; thread < thread_count; thread++)
        pthread_create(&thread_handles[thread], NULL, Hello, (void*) thread);

    printf("Hello from the main thread\n");

    for (thread = 0; thread < thread_count; thread++)
        pthread_join(thread_handles[thread], NULL);

    free(thread_handles);
    return 0;
}

void* Hello(void* rank) {
    long my_rank = (long) rank;
    printf("Hello from thread %ld of %d\n", my_rank, thread_count);
    return NULL;
}
An example Makefile for this executable is
CC = gcc
DEBUG = -g -Wall
PTHREADSLIB = -lpthread

pth_hello : pth_hello.c
	$(CC) $(DEBUG) -o pth_hello pth_hello.c $(PTHREADSLIB)

clean:
	rm -f pth_hello

.PHONY: clean
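To build and quickly check the executable before submitting (the thread count of 4 here is arbitrary):
$ make pth_hello
$ ./pth_hello 4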
Example sbatch script for constant number of processors (constant_sn_job.sbatch)¶
#!/usr/bin/env bash
#SBATCH --job-name=constant_sn_job
#SBATCH --account=<your account>
#SBATCH --partition=hive
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --time=00:10:00
#SBATCH --output=%x-%j.out
module load anaconda3 pylauncher
python3 constant_sn_launcher.py
Example Python script for constant number of processors (constant_sn_launcher.py)¶
from pylauncher import pylauncher
pylauncher.ClassicLauncher("constant_sn_job.in", cores=3)
Variable number of processors for a non-MPI parallel job¶
It is also possible to use a varying number of processors for a non-MPI parallel job. The key change is that the cores keyword becomes cores="file", and the core counts are contained within the command line file itself.
The pylauncher.ClassicLauncher function call becomes
pylauncher.ClassicLauncher("your_jobfile.in", cores="file")
and the command line file becomes something like
2,./pth_hello 2
5,./pth_hello 5
3,./pth_hello 3
4,./pth_hello 4
5,./pth_hello 5
where the first number is the number of processors to use for that command line.
Note
It is still up to you to tell your program how many processors to use in that case, so be careful to do that correctly.
Example sbatch script for variable number of processors (variable_sn_job.sbatch)¶
#!/usr/bin/env bash
#SBATCH --job-name=variable_sn_job
#SBATCH --account=<your account>
#SBATCH --partition=hive
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --time=00:10:00
#SBATCH --output=%x-%j.out
module load anaconda3 pylauncher
python3 variable_sn_launcher.py
Example Python script for variable number of processors (variable_sn_launcher.py)¶
from pylauncher import pylauncher
pylauncher.ClassicLauncher("variable_sn_job.in", cores="file")
Example command line file for variable number of processors (variable_sn_job.in)¶
2,./pth_hello 2
5,./pth_hello 5
3,./pth_hello 3
4,./pth_hello 4
5,./pth_hello 5
3,./pth_hello 3
3,./pth_hello 3
3,./pth_hello 3
4,./pth_hello 4
4,./pth_hello 4
4,./pth_hello 4
2,./pth_hello 2
3,./pth_hello 3
4,./pth_hello 4
5,./pth_hello 5
4,./pth_hello 4
Running an MPI Workflow using Pylauncher¶
Distributed memory parallel programs form an important set of tools for many users. As more HPC workflows seek to create large-scale computational data sets, it becomes important to support users with this style of workflow.
The MPI workflow is a little more complicated than the other uses detailed above, and it requires a different launcher function to account for how the jobs are started. The function to be called is pylauncher.SrunLauncher, and the call in the file can be
pylauncher.SrunLauncher("your_jobfile.in", cores="file")
in a similar form to the variable case above.
Example executable source code (mpi_hello.c)¶
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int jobno, slp, mytid, ntids;
    char outfile[5+5+5+1];   /* "pytmp-" + 4 digits + "-" + 4 digits + NUL */
    FILE *f;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ntids);
    MPI_Comm_rank(MPI_COMM_WORLD, &mytid);

    /* Two arguments are required: a job number and a sleep time. */
    if (argc < 3) {
        if (mytid == 0) printf("Usage: mpi_hello jobno slp\n");
        MPI_Finalize();
        return 1;
    }
    jobno = atoi(argv[1]);
    slp = atoi(argv[2]);

    MPI_Barrier(MPI_COMM_WORLD);

    /* Each rank writes a small marker file named after the job and rank. */
    sprintf(outfile, "pytmp-%04d-%04d", jobno, mytid);
    f = fopen(outfile, "w");
    fprintf(f, "%d/%d working\n", mytid, ntids);
    fclose(f);

    if (mytid == 0) {
        printf("Job %d on %d processors\n", jobno, ntids);
    }
    sleep(slp);

    MPI_Finalize();
    return 0;
}
which can be built with this Makefile:
MPICC = mpicc
DEBUG = -g -Wall

mpi_hello : mpi_hello.c
	$(MPICC) $(DEBUG) -o mpi_hello mpi_hello.c

clean:
	rm -f mpi_hello

.PHONY: clean
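Building requires an MPI compiler wrapper to be on your path. The module names below are site-specific assumptions, so check module avail on your cluster:
$ module load gcc mvapich2   # hypothetical module names; adjust to your site
$ make mpi_hello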
Example sbatch script (srun_job.sbatch)¶
#!/usr/bin/env bash
#SBATCH --job-name=srun_job
#SBATCH --account=<your account>
#SBATCH --partition=hive
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:10:00
#SBATCH --output=%x.%j.out
#SBATCH --mem-per-cpu=200M
module load anaconda3 pylauncher
python3 srun_launcher.py
Example MPI pylauncher script (srun_launcher.py)¶
from pylauncher import pylauncher
pylauncher.SrunLauncher("srun_job.in", cores="file")
Example MPI command line file (srun_job.in)¶
4,./mpi_hello 0 10
4,./mpi_hello 1 10
8,./mpi_hello 2 10
4,./mpi_hello 3 10
4,./mpi_hello 4 10
8,./mpi_hello 5 10
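As with the earlier examples, the job is submitted with sbatch. Based on the source code above, each MPI rank writes a small pytmp-<jobno>-<rank> marker file to the working directory, which can be used to verify that all tasks ran:
$ sbatch srun_job.sbatch
$ ls pytmp-*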
Conclusion¶
It is hoped that pylauncher can be a good replacement for HTC Launcher, GNU Parallel, and job arrays.