Updated 2021-05-17

Run Mash on the Cluster

Overview

  • This guide covers how to load and use mash/1.0.2.
  • Mash requires several other modules to be loaded before it can be loaded itself.

Load Mash

  • To use Mash, load these modules in your PBS script using module load, before loading mash itself:
    • module load gcc/4.9.0
    • module load capnproto-c++/0.5.3
    • module load boost/1.57.0
    • module load zlib/1.2.8
    • module load autoconf/2.69
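
A quick way to confirm that the module load lines worked is to check that the mash command is on your PATH before running it. The sketch below is a minimal example; require_cmd is a hypothetical helper written for this guide, not part of Mash or the module system.

```shell
# require_cmd: hypothetical helper that checks whether a command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found"
    else
        echo "$1 not found; check the module load lines" >&2
        return 1
    fi
}

# In the PBS script, after the module load lines, you could add:
#   require_cmd mash || exit 1
```

Stopping the job early this way keeps a failed module load from showing up later as a less obvious "command not found" error in the output file.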

Walkthrough: Run Mash on the Cluster

Part 1: The PBS Script

#PBS -N mashTest
#PBS -A [Account]
#PBS -l nodes=2:ppn=4
#PBS -l pmem=2gb
#PBS -l walltime=2:00
#PBS -q inferno
#PBS -j oe
#PBS -o mashTest.out

cd $PBS_O_WORKDIR
module load gcc/4.9.0
module load capnproto-c++/0.5.3
module load boost/1.57.0
module load zlib/1.2.8
module load autoconf/2.69
module load mash

mash sketch genome1.fna
mash sketch genome2.fna
mash dist genome1.fna.msh genome2.fna.msh
  • The #PBS directives are standard, requesting 2 minutes of walltime and 2 nodes with 4 cores each. More on #PBS directives can be found in the PBS guide.
  • $PBS_O_WORKDIR is a variable that holds the directory from which you submit the PBS script.
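
To make the resource request concrete, here is a small sketch of the arithmetic behind nodes=2:ppn=4 and walltime=2:00. It assumes the usual Torque walltime form [[HH:]MM:]SS; walltime_seconds is a hypothetical helper, not a PBS command.

```shell
# walltime_seconds: hypothetical helper, converts [[HH:]MM:]SS to seconds.
walltime_seconds() {
    echo "$1" | awk -F: '{ t = 0; for (i = 1; i <= NF; i++) t = t * 60 + $i; print t }'
}

nodes=2   # from "#PBS -l nodes=2:ppn=4"
ppn=4     # processors (cores) per node
echo "total cores: $(( nodes * ppn ))"              # prints "total cores: 8"
echo "walltime: $(walltime_seconds 2:00) seconds"   # prints "walltime: 120 seconds"
```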

Warning

Make sure the .fna files you want to process are in the same directory as the PBS script.

  • Output files will also appear in this directory
  • The module load lines load Mash and the programs it depends on
  • Lines that begin with mash execute the program
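
Because the job fails if the input files are not in the submission directory, it can help to check for them before calling mash. The following is a minimal sketch; check_inputs is a hypothetical helper, and the file names are the ones used in this walkthrough.

```shell
# check_inputs: hypothetical helper that reports missing or empty input files.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# In the PBS script, after "cd $PBS_O_WORKDIR", you could add:
#   check_inputs genome1.fna genome2.fna || exit 1
```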

Part 2: Submit Job and Check Status

  • Make sure you're in the directory that contains the PBS script and the .fna files
  • Submit as normal with qsub <pbs script name>; in this case, qsub mash.pbs
  • Check job status with qstat -u username3 -n, replacing "username3" with your GT username
  • You can delete the job with qdel 22182721, replacing the number with the job ID returned by qsub
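
The job ID that qsub prints can be saved in a variable so you do not have to copy it by hand. The sketch below shows one way to do that; job_number is a hypothetical helper, and the scheduler commands are left as comments because they only work on the cluster.

```shell
# job_number: hypothetical helper that keeps the part of a job ID before the first dot.
job_number() {
    echo "${1%%.*}"
}

# On the cluster, the flow might look like this:
#   jobid=$(qsub mash.pbs)          # qsub prints the full job ID
#   qstat "$jobid"                  # check this job only
#   qdel "$(job_number "$jobid")"   # delete it if needed

job_number 22865014.shared-sched.pace.gatech.edu   # prints 22865014
```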

Part 3: Collecting Results

  • In the directory where you submitted the PBS script, you should see several newly generated files, including genome1.fna.msh, genome2.fna.msh, and mashTest.out
  • Open mashTest.out in a text editor to take a look. With Vim, this would be vim mashTest.out
  • mashTest.out should look something like this:
Job name:   mashTest
Queue:      inferno
End PBS Prologue Fri Nov  9 09:43:47 EST 2018
---------------------------------------
Sketching genome1.fna...
Writing to genome1.fna.msh...
Sketching genome2.fna...
Writing to genome2.fna.msh...
genome1.fna genome2.fna 0.0222766   0   456/1000
---------------------------------------
Begin PBS Epilogue Fri Nov  9 09:43:49 EST 2018
Job ID:     22865014.shared-sched.pace.gatech.edu
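
The mash dist line in the output has five columns: reference ID, query ID, Mash distance, p-value, and matching hashes. As a minimal sketch, the hypothetical helper summarize_dist below picks that line out of the output file and labels the columns. It assumes file names without spaces and treats any five-column line ending in a number/number pair as a result line.

```shell
# summarize_dist: hypothetical helper that labels the columns of mash dist lines.
summarize_dist() {
    awk 'NF == 5 && $5 ~ /^[0-9]+\/[0-9]+$/ {
        split($5, h, "/")   # e.g. 456/1000 -> h[1]=456, h[2]=1000
        printf "%s vs %s: distance=%s p-value=%s shared-hashes=%s of %s\n", $1, $2, $3, $4, h[1], h[2]
    }'
}

# Sample line copied from the mashTest.out listing in this guide.
echo 'genome1.fna genome2.fna 0.0222766   0   456/1000' | summarize_dist
# prints: genome1.fna vs genome2.fna: distance=0.0222766 p-value=0 shared-hashes=456 of 1000
```

On the cluster, summarize_dist < mashTest.out would apply the same filter to the whole output file.
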
  • After the result files are produced, you can move them off the cluster. Refer to the file transfer guide for help.
  • Congratulations! You have successfully run a Mash program on the cluster.