Updated 2021-05-17
Run BWA on the Cluster¶
Summary¶
- Use `module avail bwa` to see all available versions of BWA on the cluster.
- To load BWA in your `PBS` script:
    - Load its dependent module first with `module load open64/4.5.1`
    - Load BWA (this guide focuses on BWA/0.7.4) with `module load BWA/0.7.4`
- To run BWA:
    - In your `PBS` script, put all lines that execute BWA after the `module load` lines that load BWA.
Warning
When aligning sequences with multiple threads (or using the `-t` flag in general), you must set the number of threads to the number of processors you requested. Example: if you requested 8 processors (2 nodes and 4 processors per node), set the thread option to `-t 8`.
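The rule in this warning is simple arithmetic: threads = nodes x processors per node. The following is a minimal shell sketch, assuming the `nodes=2:ppn=4` request used in this guide. The variable names `nodes`, `ppn`, and `threads` are illustrative, and the `$PBS_NODEFILE` lookup assumes a Torque-style scheduler that writes one line per allocated processor to that file.

```shell
#!/bin/bash
# Values mirroring "#PBS -l nodes=2:ppn=4" (illustrative variable names).
nodes=2
ppn=4

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE}" ]; then
    # Inside a job: count the allocated processor slots
    # (assumes the node file has one line per slot).
    threads=$(( $(wc -l < "${PBS_NODEFILE}") ))
else
    # Outside a job: fall back to the requested nodes x ppn.
    threads=$(( nodes * ppn ))
fi

echo "Thread option to pass to BWA: -t ${threads}"
```

Computing the value this way, rather than typing it by hand, keeps the `-t` value in step with the resource request if you later change it.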
Example PBS Script¶
#PBS -N bwaTest
#PBS -A [Account]
#PBS -l nodes=2:ppn=4
#PBS -l walltime=30:00
#PBS -q inferno
#PBS -j oe
#PBS -o bwaResult.out
cd $PBS_O_WORKDIR
module load open64/4.5.1
module load bwa/0.7.4
bwa aln -t 8 RefSeqbwaidx <sequence file>.txt > <output file>.txt.bwa
- The `#PBS` directives are standard, requesting 30 min of walltime and 2 nodes with 4 cores per node. More on `#PBS` directives can be found in the PBS guide.
- `$PBS_O_WORKDIR` is simply a variable that represents the directory you submit the `PBS` script from. Make sure the `.txt` sequence file and any other files you need are in the same directory as the `PBS` script. The `cd $PBS_O_WORKDIR` line tells the cluster to enter that directory and look there for all the files for the job. If you use `$PBS_O_WORKDIR`, all your files must be in the same folder as your `PBS` script; otherwise the cluster won't be able to find the files it needs.
- Output files will also show up in the same directory as the `PBS` script.
- The `module load` lines load BWA and its dependent module.
- `bwa aln -t 8` executes BWA. It is just a general example line, and BWA has more functionality than just alignment. The point is to show how the `-t` flag is used. Here, 8 threads are specified after `-t`, as 8 processors were requested (2 nodes x 4 processors per node).
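Because the job looks for everything in `$PBS_O_WORKDIR`, a quick check before the BWA line can catch a missing input early. This is a sketch only: `require_files` is a hypothetical helper, not part of BWA or PBS, and `reads.txt` stands in for your own sequence file.

```shell
#!/bin/bash
# require_files DIR FILE...
# Hypothetical helper: returns 1 if any FILE is missing from DIR, 0 otherwise.
require_files() {
    local dir="$1"
    shift
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -e "${dir}/${f}" ]; then
            echo "Missing input: ${dir}/${f}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Inside a PBS script this would sit before the bwa line, for example:
#   cd "$PBS_O_WORKDIR"
#   require_files "$PBS_O_WORKDIR" reads.txt || exit 1
```

Failing fast like this means a misplaced file shows up as a clear message in the `.out` file instead of a less obvious error from BWA itself.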
Submit Job and Check Status¶
- Make sure you're in the directory that contains the `PBS` script, the sequence files, and any other files you need.
- Submit as normal, with `qsub <pbs script name>`. In this case, `qsub bwa.pbs`, or whatever you called the `PBS` script. You can name `PBS` scripts whatever you want; just keep the `.pbs` at the end.
- Check job status with `qstat -u username3 -n`, replacing "username3" with your GT username.
- You can delete the job with `qdel 22182721`, replacing the number with the job ID returned after running `qsub`.
- Depending on the resources requested and the queue the job runs on, the job may take varying amounts of time to start. To estimate the time until the job executes, run `showstart 22182721`, replacing the number with the job ID returned after running `qsub`. More helpful commands can be found in this guide.
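The job ID passed to `qdel` and `showstart` is the leading number in what `qsub` prints. Below is a minimal sketch of capturing it, assuming `qsub` prints an identifier of the form `<number>.<server name>` (typical for Torque-style schedulers, but check your own output). The string `22182721.example-server` is a made-up stand-in so the snippet runs without a scheduler.

```shell
#!/bin/bash
# On the cluster you would capture the real output instead:
#   qsub_output=$(qsub bwa.pbs)
qsub_output="22182721.example-server"   # hypothetical sample value

# Strip everything from the first dot onward to keep only the numeric job ID.
jobid="${qsub_output%%.*}"

# Print the follow-up commands from this section with the job ID filled in.
echo "Check status:   qstat -u username3 -n"
echo "Estimate start: showstart ${jobid}"
echo "Delete job:     qdel ${jobid}"
```

Storing the ID in a variable avoids retyping the number when you check on or cancel the job later in the same session.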
Collecting Results¶
- All files created will be in the same folder as your `PBS` script (the same directory you ran `qsub` from).
- The `.out` file will be found here as well. It contains the results of the job, as well as diagnostics and a report of resources used during the job. If the job fails or doesn't produce the result you were hoping for, the `.out` file is a great debugging tool.
- You can transfer the resulting files off the cluster using scp or a file transfer service.
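When a job misbehaves, scanning the `.out` file for error messages is a quick first step. The following is a sketch, assuming the output file name `bwaResult.out` from the example script. `scan_out_file` is a hypothetical helper, and the search pattern is only a starting point, since the exact wording of messages depends on the tool and the scheduler.

```shell
#!/bin/bash
# scan_out_file FILE
# Hypothetical helper: prints lines mentioning "error" or "fail"
# (case-insensitive) with their line numbers.
# Returns 1 if FILE is missing, 0 otherwise (even when nothing matches).
scan_out_file() {
    local out_file="$1"
    if [ ! -f "${out_file}" ]; then
        echo "No such file: ${out_file}" >&2
        return 1
    fi
    grep -n -i -E 'error|fail' "${out_file}" || echo "No matching lines in ${out_file}"
    return 0
}

# Example, run from the directory you submitted the job from:
#   scan_out_file bwaResult.out
```

An empty result does not prove the job succeeded, so it is still worth reading the end of the `.out` file and checking that the expected output files exist.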