Updated 2023-03-31

Run BWA on the Cluster

Summary

  • Use module avail bwa to see all available versions of bwa on the cluster.
  • To load BWA in your SBATCH script:
    • Load BWA (this guide focuses on BWA 0.7.17) with module load bwa/0.7.17
  • To run BWA:
    • In your SBATCH script, put all lines executing BWA after the module load line that loads BWA (a quick interactive check of the module commands is sketched below).
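
A minimal interactive sketch of the module commands above (module names and versions may differ on your cluster):

module avail bwa        # list the BWA modules installed on the cluster
module load bwa/0.7.17  # load the version used in this guide
which bwa               # confirm the bwa executable is now on your PATH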

Walkthrough: Run BWA on the Cluster

  • This walkthrough will cover how to use the bwa fa2pac command to convert FASTA files to PAC files (a quick way to list all BWA subcommands is sketched after this list).
  • Other uses of BWA can be found here
  • example.fasta can be found here
  • SBATCH Script can be found here
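
Other BWA subcommands can be listed directly; a minimal sketch, assuming the module from the Summary is already loaded:

bwa   # with no arguments, BWA prints its usage message and the list of subcommands, including fa2pac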

Part 1: The SBATCH Script

#!/bin/bash
#SBATCH -JbwaTest
#SBATCH -A [Account] 
#SBATCH -N2 --ntasks-per-node=4
#SBATCH -t30
#SBATCH -qinferno
#SBATCH -oReport-%j.out

cd $SLURM_SUBMIT_DIR
module load bwa/0.7.17
bwa fa2pac example.fasta pac_prefix
  • The #SBATCH directives are standard, requesting 30 minutes of walltime and 2 nodes with 4 tasks per node. More on #SBATCH directives can be found in the Using Slurm on Phoenix Guide.
  • $SLURM_SUBMIT_DIR is a variable that holds the directory you submitted the SBATCH script from. The cd line tells the job to enter that directory and look for its input files there. Make sure example.fasta and any other files the job needs are in the same directory as the SBATCH script; otherwise the cluster won't be able to find them (a sketch of a typical directory layout follows this list).
  • Output files will also show up in the same directory as the SBATCH script.
  • The module load line loads BWA.
  • bwa fa2pac example.fasta pac_prefix executes BWA. It is just a general example line; BWA has more functionality than just this conversion, including read alignment.
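
A minimal sketch of staging the job before submission, assuming a hypothetical working directory named bwa_test and the file names used in this guide:

mkdir -p ~/bwa_test   # hypothetical working directory; home or scratch space both work
cd ~/bwa_test
# place the SBATCH script and the input FASTA here before submitting
ls
# bwa.sbatch  example.fasta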

Part 2: Submit Job and Check Status

  • Make sure you're in the directory that contains the SBATCH script, the sequence files, and any other files you need.
  • Submit as normal with sbatch <sbatch script name>, in this case sbatch bwa.sbatch (or whatever you called the SBATCH script). You can name the SBATCH script whatever you want; just keep the .sbatch extension. An example session is sketched after this list.
  • Check job status with squeue --job <jobID>, replacing <jobID> with the job ID returned after running sbatch.
  • You can delete the job with scancel <jobID>, replacing <jobID> with the job ID returned after running sbatch.
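
A minimal example session, assuming the script is named bwa.sbatch (the job ID shown is illustrative):

sbatch bwa.sbatch
# Submitted batch job 624389
squeue --job 624389   # check the job's status
scancel 624389        # cancel the job if it is no longer needed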

Part 3: Collecting Results

  • All files created will be in the same folder where your SBATCH script is (same directory you ran sbatch from)
  • The .out file will be found here as well. It contains the results of the job, as well as diagnostics and a report of the resources used during the job. If the job fails or doesn't produce the result you were hoping for, the .out file is a great debugging tool.
  • All of the produced pac_prefix files can be found here. Report-<jobID>.out should look like this (commands for inspecting these outputs are sketched after the example report):
---------------------------------------
Begin Slurm Prolog: Feb-02-2023 23:56:25
Job ID:    624389
User ID:   svangala3
Account:   phx-pace-staff
Job name:  bwaTest
Partition: cpu-small
QOS:       inferno
---------------------------------------
[main] Version: 0.7.17-r1188
[main] CMD: bwa fa2pac example.fasta pac_prefix
[main] Real time: 0.026 sec; CPU: 0.005 sec
---------------------------------------
Begin Slurm Epilog: Feb-02-2023 23:56:26
Job ID:        624389
Array Job ID:  _4294967294
User ID:       svangala3
Account:       phx-pace-staff
Job name:      bwaTest
Resources:     cpu=8,mem=8G,node=2
Rsrc Used:     cput=00:00:16,vmem=1600K,walltime=00:00:02,mem=0,energy_used=0
Partition:     cpu-small
QOS:           inferno
Nodes:         atl1-1-03-002-1-[1-2]
---------------------------------------
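
A minimal sketch of inspecting the outputs from the submission directory, assuming the hypothetical bwa_test directory from Part 1 and the job ID shown above:

cd ~/bwa_test
ls pac_prefix*          # files produced by bwa fa2pac
less Report-624389.out  # job output, diagnostics, and resource usage
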
  • After the result files are produced, you can move them off the cluster; refer to the file transfer guide for help (a minimal scp sketch appears at the end of this guide).
  • Congratulations! You successfully ran BWA on the cluster.
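
As referenced in Part 3, a minimal sketch of copying the results off the cluster with scp, run from your local machine and assuming a placeholder login hostname and the hypothetical bwa_test directory (see the file transfer guide for the correct hostname and recommended transfer tools):

# quoting the remote path lets the wildcard expand on the cluster, not locally
scp 'username@login.cluster.example.edu:~/bwa_test/pac_prefix*' .
scp 'username@login.cluster.example.edu:~/bwa_test/Report-624389.out' .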