Updated 2021-05-17

Run RAxML on the Cluster

Overview

  • RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees.
  • This guide covers how to run RAxML on the Cluster.
  • Full documentation is available in the RAxML Manual.

Summary

  • To run RAxML, pass the following flags to raxmlHPC:
    • -m : Model of Binary (Morphological), Nucleotide, Multi-State, or Amino Acid Substitution.
    • -p : Specify a random number seed for the parsimony inferences. This allows you to reproduce your results and helps debug the program.
    • -s : Specify the name of the alignment data file in PHYLIP format.
    • -# : Specify the number of alternative runs on distinct starting trees.
    • -n : Specify the run name, which is used in the names of the output files (for example, RAxML_info.T5 when the name is T5).
  • You can find more information and more flags by running raxmlHPC -h after loading the required modules on the Cluster.

Walkthrough: Run RAxML on the Cluster

  • This walkthrough will cover how to run an ML search on binary data in a PHYLIP file.
  • The example used in this walkthrough, plus many more, can be found here.
  • binary.phy can be found here.
  • The PBS script can be found here.
  • You can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful.
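
  • If you want to see what a binary PHYLIP alignment looks like before downloading binary.phy, the sketch below writes a tiny one. The file name toy_binary.phy, the taxon names, and the 0/1 data are all made up for illustration and are not the walkthrough's data. It assumes the relaxed PHYLIP layout that RAxML accepts: a header with the number of taxa and characters, then one line per taxon.

```shell
# Hypothetical toy alignment, NOT the binary.phy used in this walkthrough.
# Header line: <number of taxa> <number of characters>.
# Each following line: a taxon name, whitespace, then its 0/1 character states.
cat > toy_binary.phy <<'EOF'
4 6
taxonA 010110
taxonB 011100
taxonC 110011
taxonD 100011
EOF

# Sanity check: the taxon count in the header should match the number of data lines.
n_taxa=$(head -n 1 toy_binary.phy | awk '{print $1}')
n_rows=$(tail -n +2 toy_binary.phy | grep -c .)
echo "header says $n_taxa taxa, file has $n_rows sequence lines"
```

  • A mismatch between the two numbers is a common reason for an alignment to be rejected, so it is worth checking before submitting a job.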

Part 1: The PBS Script

#PBS -N raxmlTest
#PBS -A [Account]
#PBS -l nodes=1:ppn=2
#PBS -l pmem=2gb
#PBS -l walltime=3:00
#PBS -q inferno
#PBS -j oe
#PBS -o raxmlTest.out

cd $PBS_O_WORKDIR
module load gcc/4.9.0
module load openmpi/1.8
module load raxml/8.0.19

raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5 

  • The #PBS directives are standard: they request 3 minutes of walltime and 1 node with 2 cores. More on #PBS directives can be found in the PBS guide.
  • $PBS_O_WORKDIR is a variable that holds the directory from which you submitted the PBS script. Make sure the files you want to use are in the same directory as the PBS script.
  • Output files will also appear in this directory.
  • module load raxml/8.0.19 loads version 8.0.19 of RAxML. To see which versions of a piece of software are available, run module avail [Software] and load the one you want. The other modules are dependencies that must be loaded before RAxML.
  • raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5 tells RAxML to carry out 20 ML searches, each starting from a distinct randomized stepwise-addition parsimony tree.
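
  • To adapt the script to your own data, it can help to keep each flag's value in a variable and print the assembled command before running it. The sketch below is one possible way to do that. The variable names are illustrative and are not part of RAxML or of the PBS script above.

```shell
# Values taken from the walkthrough; change them for your own data.
MODEL=BINGAMMA        # -m : substitution model
SEED=12345            # -p : random seed for the parsimony inferences
ALIGNMENT=binary.phy  # -s : PHYLIP alignment file
RUNS=20               # -# : number of runs on distinct starting trees
RUN_NAME=T5           # -n : name used for the output files

# Assemble the command as a string so it can be inspected before running.
RAXML_CMD="raxmlHPC -m $MODEL -p $SEED -s $ALIGNMENT -# $RUNS -n $RUN_NAME"
echo "$RAXML_CMD"

# In a real PBS script you might fail early if the alignment is missing, then run it:
#   [ -f "$ALIGNMENT" ] || { echo "missing $ALIGNMENT" >&2; exit 1; }
#   $RAXML_CMD
```

  • With the values shown, the printed command is identical to the raxmlHPC line in the PBS script.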

Part 2: Submit Job and Check Status

  • Make sure you're in the directory that contains the PBS script as well as the input file, binary.phy.
  • Submit as normal, with qsub <pbs script name>. In this case, qsub raxml.pbs.
  • Check job status with qstat -t 22182721, replacing the number with the job id returned after running qsub.
  • You can delete the job with qdel 22182721, again replacing the number with the job id returned after running qsub.
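
  • If you script the submission, you may want to capture the job id instead of copying it by hand. The helper below is a small sketch that assumes qsub prints the id as a number followed by a dot and a server name, which is typical for PBS/Torque but may differ on your scheduler. The function name job_number and the server name in the example are made up.

```shell
# Strip everything from the first dot onward, leaving the numeric job id.
# Assumes qsub prints "<number>.<server name>"; the input below is a made-up example.
job_number() {
    printf '%s\n' "${1%%.*}"
}

job_number "22182721.example-server"

# On the cluster (not run here), this could be used as:
#   full_id=$(qsub raxml.pbs)
#   qstat -t "$(job_number "$full_id")"
```

  • An id that contains no dot is returned unchanged, so the helper is safe to use either way.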

Part 3: Collecting Results

  • In the directory where you submitted the PBS script, you should see a raxmlTest.out file, which contains the output of the job, along with 20 RAxML_log.T5.RUN.* files, 20 RAxML_parsimonyTree.T5.RUN.* files, and 20 RAxML_result.T5.RUN.* files (one of each per run, numbered 0 through 19), a RAxML_info.T5 file, and a binary.phy.reduced file. Use cat or open a file in a text editor to take a look.
  • raxmlTest.out should look like this.
  • All output files can be found here.
  • After the result files are produced, you can move them off the cluster. Refer to the file transfer guide for help.
  • Congratulations! You successfully ran RAxML on the cluster.
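
  • As a quick check that every run finished, you can count the per-run result files and compare the count with the value passed to -#. The functions below are a sketch with made-up names (count_outputs, check_results). They assume the result files follow the RAxML_result.<run name>.RUN.<number> naming described in Part 3.

```shell
# Count regular files in directory $1 whose names start with prefix $2.
count_outputs() {
    count=0
    for f in "$1"/"$2"*; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Report whether all expected per-run result files are present.
# $1 = directory, $2 = run name given to -n, $3 = number of runs given to -#
check_results() {
    found=$(count_outputs "$1" "RAxML_result.$2.RUN.")
    if [ "$found" -eq "$3" ]; then
        echo "OK: found $found of $3 result files"
    else
        echo "INCOMPLETE: found $found of $3 result files"
        return 1
    fi
}

# Usage in the submission directory (not run here):
#   check_results . T5 20
```

  • If the check reports INCOMPLETE, the job may have hit its walltime limit before all runs finished. raxmlTest.out is the first place to look.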