Updated 2021-05-27
Run R in Batch Mode on the Cluster
Overview
- R is a very powerful tool for statistics and data science
- R can be run in two ways:
  - Interactively
  - Batch mode, where the program is submitted and runs without interaction
- This guide focuses on how to run R scripts in batch mode
Tips
- To run an R script, you have to specify two things in your PBS script:
  - load the R module with module load <R version>
  - run the R script in batch mode with R CMD BATCH <scriptname.R>
- Submit with qsub <jobName.pbs>
- Output will show up in a .Rout file in the same directory the PBS script was submitted from.
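The name of the output file can be predicted from the script name. As far as the R CMD BATCH documentation describes it, when no output file is given, a trailing .R is stripped from the input file name and .Rout is appended. The sketch below illustrates that rule only; expected_rout is a hypothetical helper name, not part of R or PBS.

```shell
#!/bin/sh
# Hypothetical helper (not part of R or PBS): print the default output file
# name that `R CMD BATCH <script>` is expected to use when no output file
# is given on the command line.
expected_rout() {
    # basename drops any leading directory and a trailing ".R" suffix
    printf '%s.Rout\n' "$(basename "$1" .R)"
}

# For the walkthrough script this prints: add.Rout
expected_rout add.R
```

Knowing the name in advance makes it easier to check for the file once the job has finished.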
Walkthrough: Run an R Script in Batch Mode
- This walkthrough will use a simple R script that defines a function called add(x,y), which adds two numbers
- The output will print 2, which is the result of add(1,1)
Here is the example R script, which you can save as add.R in the same directory as the PBS script.
# Add two integers
add <- function(x,y) {
  return(x + y)
}
print(add(1,1))
Part 1: The PBS Script
#PBS -N R_test
#PBS -A [Account]
#PBS -l nodes=1:ppn=2
#PBS -l pmem=2gb
#PBS -l walltime=1:00
#PBS -q inferno
#PBS -j oe
#PBS -o addR.out
cd $PBS_O_WORKDIR
module load r
R CMD BATCH add.R
- The #PBS directives are standard, requesting just 1 minute of walltime and 1 node with 2 cores. More on #PBS directives can be found in the PBS guide
- $PBS_O_WORKDIR is a variable that represents the directory you submit the PBS script from. Make sure the R script you want to run (in this case, add.R) is in the same directory as the PBS script.
  - Output files will also show up in this directory
- module load r loads R. To load a specific version instead, give its full module name, for example module load R/3.4.3 for version 3.4.3. To see what R versions are available, run module avail R, and load the one you want
- R CMD BATCH add.R tells the cluster to run the add.R R script in batch mode. It will print the output to an .Rout file.
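Because the job changes into $PBS_O_WORKDIR and then looks for add.R there, a quick check before submitting can catch a misplaced file. The following is a minimal sketch, not a PBS feature: check_job_files is a hypothetical function name, and the file names are the ones used in this walkthrough.

```shell
#!/bin/sh
# Hypothetical pre-submission check (not a PBS feature): confirm that the PBS
# script exists and that the R script it runs is in the same directory.
check_job_files() {
    pbs_script=$1
    r_script=$2
    job_dir=$(dirname "$pbs_script")
    if [ ! -f "$pbs_script" ]; then
        echo "missing PBS script: $pbs_script" >&2
        return 1
    fi
    if [ ! -f "$job_dir/$r_script" ]; then
        echo "missing R script: $job_dir/$r_script" >&2
        return 1
    fi
    echo "ok: $r_script found next to $pbs_script"
}

# Example call with the walkthrough's file names; it reports a problem when
# the files are not in the current directory.
check_job_files addR.pbs add.R || echo "fix the files before running qsub"
```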
Part 2: Submit Job and Check Status
- Make sure you're in the directory that contains the PBS script
- Submit as normal, with qsub <pbs script name>. In this case, qsub addR.pbs
- Check job status with qstat -t 22182721, replacing the number with the job id returned after running qsub
- You can delete the job with qdel 22182721, again replacing the number with the job id returned after running qsub
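The job id returned by qsub can be captured in a shell variable so it does not have to be copied by hand. This is a minimal sketch under the assumption that qsub prints an id of the form <number>.<server>; the server name in the example and the helper name job_number are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the numeric part of a job id such as
# "22182721.some-server" (assumed output format of qsub).
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Only submit when qsub is actually available (that is, on the cluster)
# and the walkthrough's PBS script is in the current directory.
if command -v qsub >/dev/null 2>&1 && [ -f addR.pbs ]; then
    job_id=$(qsub addR.pbs)
    qstat -t "$(job_number "$job_id")"
else
    # Off the cluster, just demonstrate the parsing on a made-up id.
    job_number "22182721.some-server"
fi
```

The same variable can be passed to qdel if the job needs to be removed.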
Part 3: Collecting Results
- In the directory where you submitted the PBS script, you should see add.Rout and addR.out files. The add.Rout file has the results of the job. Use cat add.Rout or open the file in a text editor to take a look.
- add.Rout should look like:
> # Add two integers
>
> add <- function(x,y) {
+ return(x + y)
+ }
>
> print(add(1,1))
[1] 2
>
> proc.time()
   user  system elapsed
  0.348   0.061   0.566
- After the result files are produced, you can move the files off the cluster. Refer to the file transfer guide for help.
- Congratulations! You successfully ran an R script in batch mode.
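A finished run can also be checked from the shell instead of by eye. The sketch below relies on two behaviors: R CMD BATCH appends a proc.time() call to a run that completes (as in the add.Rout listing above), and R prints "Execution halted" when a script stops on an error. The function name rout_status is hypothetical, and the check is a heuristic rather than an official status report.

```shell
#!/bin/sh
# Hypothetical helper: classify an .Rout file as missing, failed, ok, or
# incomplete, based on markers R writes to the file.
rout_status() {
    if [ ! -f "$1" ]; then
        echo "missing"
    elif grep -q '^Execution halted' "$1"; then
        echo "failed"
    elif grep -q '^> proc.time()' "$1"; then
        echo "ok"
    else
        echo "incomplete"   # the job may still be running
    fi
}

# Report on the walkthrough's output file in the current directory.
rout_status add.Rout
```

An "incomplete" result usually just means the job has not finished yet, so it is worth checking again before assuming a problem.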