Updated 2023-03-31
Run RAxML on the Cluster¶
Overview¶
- RAxML (Randomized Axelerated Maximum Likelihood) is a program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees.
- This guide will cover how to run RAxML on the Cluster.
- For full documentation, see the RAxML Manual.
Summary¶
- To run RAxML, pass the following flags to raxmlHPC:
- -m : Model of Binary (Morphological), Nucleotide, Multi-State, or Amino Acid Substitution.
- -p : Specify a random number seed for the parsimony inferences. This allows you to reproduce your results and helps debug the program.
- -s : Specify the name of the alignment data file in PHYLIP format.
- -# : Specify the number of alternative runs on distinct starting trees.
- -n : Specify the name appended to the output files (for example, -n T5 produces files such as RAxML_info.T5).
- You can find more information and more flags by running raxmlHPC -h after loading the required modules on the Cluster.
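As a minimal sketch of how the flags listed above fit together, the snippet below assembles a raxmlHPC command from shell variables and prints it. The variable names (MODEL, SEED, ALIGNMENT, RUNS, RUN_NAME) are illustrative assumptions and not part of RAxML; the values are the ones used in the walkthrough on this page. RAxML is only invoked if raxmlHPC is on the PATH and the alignment file exists.

```shell
#!/bin/bash
# Illustrative variable names (assumptions); the flags are the ones listed above.
MODEL="BINGAMMA"        # -m : substitution model
SEED=12345              # -p : random number seed for the parsimony inferences
ALIGNMENT="binary.phy"  # -s : alignment file in PHYLIP format
RUNS=20                 # -# : number of alternative runs on distinct starting trees
RUN_NAME="T5"           # -n : name used for the output files

# Store the command as positional parameters so each argument stays a separate word.
set -- raxmlHPC -m "$MODEL" -p "$SEED" -s "$ALIGNMENT" -# "$RUNS" -n "$RUN_NAME"

# Print the command, which is handy to have in the job log.
echo "$*"

# Run it only when RAxML is available and the alignment file is present.
if command -v raxmlHPC >/dev/null 2>&1 && [ -f "$ALIGNMENT" ]; then
    "$@"
fi
```

Keeping the seed in a variable makes it easy to record and reuse, which is what allows a run to be reproduced later.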
Walkthrough: Run RAxML on the Cluster¶
- This walkthrough will cover how to run an ML search on binary data in a PHYLIP file.
- The example used in this walkthrough, plus many more, can be found here.
- binary.phy can be found here
- SBATCH Script can be found here
- You can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful.
Part 1: The SBATCH Script¶
#!/bin/bash
#SBATCH -JraxmlTest
#SBATCH -A [Account]
#SBATCH -N1 --ntasks-per-node=2
#SBATCH --mem-per-cpu=2G
#SBATCH -t3
#SBATCH -qinferno
#SBATCH -oReport-%j.out
cd $SLURM_SUBMIT_DIR
module load gcc/10.3.0
module load mvapich2/2.3.6
module load raxml/8.2.12
raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5
- The #SBATCH directives are standard, requesting just 3 minutes of walltime and 1 node with 2 cores. More on #SBATCH directives can be found in the Using Slurm on Phoenix Guide.
- $SLURM_SUBMIT_DIR is a variable that represents the directory you submit the SBATCH script from. Make sure the files you want to use are in the same directory as the SBATCH script.
- Output files will also appear in this directory.
- module load raxml/8.2.12 loads version 8.2.12 of RAxML. To see which versions of a software package are available, run module avail [Software], and load the one you want. The other modules are dependencies that must be loaded before RAxML is loaded.
- raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5 will have RAxML carry out 20 ML searches on 20 randomized stepwise addition parsimony trees.
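For readers less familiar with the short Slurm options, the header of the script can also be written with long option names. The fragment below is a sketch of what should be an equivalent header, keeping the same [Account] placeholder. The long names are standard sbatch options, and the explanatory comments sit on their own lines so they do not interfere with the directives.

```shell
#!/bin/bash
# Job name (short form: -JraxmlTest)
#SBATCH --job-name=raxmlTest
# Charge account (short form: -A [Account]); replace [Account] with your own
#SBATCH --account=[Account]
# One node with two tasks on it (short form: -N1 --ntasks-per-node=2)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
# 2 GB of memory per core
#SBATCH --mem-per-cpu=2G
# Walltime; a bare number is read as minutes (short form: -t3)
#SBATCH --time=3
# Quality of service (short form: -qinferno)
#SBATCH --qos=inferno
# Report file; %j expands to the job ID (short form: -oReport-%j.out)
#SBATCH --output=Report-%j.out
```

The rest of the script (the cd, module load, and raxmlHPC lines) stays the same.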
Part 2: Submit Job and Check Status¶
- Make sure you're in the directory that contains the SBATCH script and the binary.phy input file.
- Submit as normal, with sbatch <script name>. In this case, sbatch raxml.sbatch
- Check job status with squeue --job <jobID>, replacing <jobID> with the job ID returned after running sbatch.
- You can delete the job with scancel <jobID>, replacing <jobID> with the job ID returned after running sbatch.
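The submit-and-check steps above can be sketched as a short shell session. This assumes the script is saved as raxml.sbatch, as in this walkthrough, and that sbatch prints a line of the form "Submitted batch job <jobID>". The helper name job_id_from is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: pull the job ID (the last word) out of sbatch's
# "Submitted batch job <jobID>" message.
job_id_from() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ {print $NF}'
}

# Only talk to Slurm when sbatch is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f raxml.sbatch ]; then
    message=$(sbatch raxml.sbatch)
    jobid=$(job_id_from "$message")
    echo "Submitted job $jobid"

    # Check the job's status in the queue.
    squeue --job "$jobid"

    # To cancel the job instead, uncomment the next line.
    # scancel "$jobid"
fi
```

Capturing the job ID in a variable avoids retyping it for squeue and scancel, and it also gives you the number that appears in the Report-<jobID>.out file name.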
Part 3: Collecting Results¶
- In the directory where you submitted the SBATCH script, you should see a Report-<jobID>.out file, which contains the results of the job; 20 RAxML_log.T5.RUN.<n> files; 20 RAxML_parsimonyTree.T5.RUN.<n> files; 20 RAxML_result.T5.RUN.<n> files (where <n> runs from 0 to 19); a RAxML_info.T5 file; a RAxML_bestTree.T5 file; and a binary.phy.reduced file. Use cat or open a file in a text editor to take a look.
- Report-<jobID>.out should look like this:
---------------------------------------
Begin Slurm Prolog: Dec-25-2022 22:38:26
Job ID: 231349
User ID: svangala3
Account: phx-pace-staff
Job name: raxmlTest
Partition: cpu-small
QOS: inferno
---------------------------------------
IMPORTANT WARNING: Sequences t2 and t3 are exactly identical
IMPORTANT WARNING: Sequences t2 and t4 are exactly identical
IMPORTANT WARNING
Found 2 sequences that are exactly identical to other sequences in the alignment.
Normally they should be excluded from the analysis.
Just in case you might need it, an alignment file with
sequence duplicates removed is printed to file binary.phy.reduced
This is RAxML version 8.2.12 released by Alexandros Stamatakis on May 2018.
With greatly appreciated code contributions by:
Andre Aberer (HITS)
Simon Berger (HITS)
Alexey Kozlov (HITS)
Kassian Kobert (HITS)
David Dao (KIT and HITS)
Sarah Lutteropp (KIT and HITS)
Nick Pattengale (Sandia)
Wayne Pfeiffer (SDSC)
Akifumi S. Tanabe (NRIFS)
Charlie Taylor (UF)
Alignment has 19 distinct alignment patterns
Proportion of gaps and completely undetermined characters in this alignment: 1.36%
RAxML rapid hill-climbing mode
Using 1 distinct models/data partitions with joint branch length optimization
Executing 20 inferences on the original alignment using 20 distinct randomized MP trees
All free model parameters will be estimated by RAxML
GAMMA model of rate heterogeneity, ML estimate of alpha-parameter
GAMMA Model parameters will be estimated up to an accuracy of 0.1000000000 Log Likelihood units
Partition: 0
Alignment Patterns: 19
Name: No Name Provided
DataType: BINARY/MORPHOLOGICAL
Substitution Matrix: Uncorrected
RAxML was called as follows:
raxmlHPC -m BINGAMMA -p 12345 -s binary.phy -# 20 -n T5
Partition: 0 with name: No Name Provided
Base frequencies: 0.627 0.373
Inference[0]: Time 0.038043 GAMMA-based likelihood -119.663773, best rearrangement setting 5
Inference[1]: Time 0.037744 GAMMA-based likelihood -119.663772, best rearrangement setting 5
Inference[2]: Time 0.049993 GAMMA-based likelihood -119.622971, best rearrangement setting 5
Inference[3]: Time 0.037221 GAMMA-based likelihood -119.614407, best rearrangement setting 5
Inference[4]: Time 0.037553 GAMMA-based likelihood -119.614408, best rearrangement setting 5
Inference[5]: Time 0.040189 GAMMA-based likelihood -119.663772, best rearrangement setting 5
Inference[6]: Time 0.038556 GAMMA-based likelihood -119.614407, best rearrangement setting 5
Inference[7]: Time 0.037512 GAMMA-based likelihood -119.622971, best rearrangement setting 5
Inference[8]: Time 0.036832 GAMMA-based likelihood -119.614407, best rearrangement setting 5
Inference[9]: Time 0.028690 GAMMA-based likelihood -119.622971, best rearrangement setting 5
Inference[10]: Time 0.036923 GAMMA-based likelihood -119.663771, best rearrangement setting 5
Inference[11]: Time 0.036736 GAMMA-based likelihood -119.663772, best rearrangement setting 5
Inference[12]: Time 0.056471 GAMMA-based likelihood -119.622971, best rearrangement setting 5
Inference[13]: Time 0.049325 GAMMA-based likelihood -119.663771, best rearrangement setting 5
Inference[14]: Time 0.037075 GAMMA-based likelihood -119.663772, best rearrangement setting 5
Inference[15]: Time 0.037341 GAMMA-based likelihood -119.614408, best rearrangement setting 5
Inference[16]: Time 0.044418 GAMMA-based likelihood -119.614408, best rearrangement setting 5
Inference[17]: Time 0.042716 GAMMA-based likelihood -119.614408, best rearrangement setting 5
Inference[18]: Time 0.037358 GAMMA-based likelihood -119.614408, best rearrangement setting 5
Inference[19]: Time 0.061525 GAMMA-based likelihood -119.622971, best rearrangement setting 5
Conducting final model optimizations on all 20 trees under GAMMA-based models ....
WARNING the alpha parameter with a value of 13.041120 estimated by RAxML for partition number 0 with the name "No Name Provided"
is larger than 10.000000. You should do a model test and confirm that you actually need to incorporate a model of rate heterogeneity!
You can run inferences with a plain substitution model (without rate heterogeneity) by specifyng the CAT model and the "-V" option!
Inference[0] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.0
Inference[1] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.1
Inference[2] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.2
Inference[3] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.3
Inference[4] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.4
Inference[5] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.5
Inference[6] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.6
Inference[7] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.7
Inference[8] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.8
Inference[9] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.9
Inference[10] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.10
Inference[11] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.11
Inference[12] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.12
Inference[13] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.13
Inference[14] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.14
Inference[15] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.15
Inference[16] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.16
Inference[17] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.17
Inference[18] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.18
Inference[19] final GAMMA-based Likelihood: -119.545950 tree written to file /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_result.T5.RUN.19
Starting final GAMMA-based thorough Optimization on tree 8 likelihood -119.545950 ....
Final GAMMA-based Score of best tree -119.545950
Program execution info written to /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_info.T5
Best-scoring ML tree written to: /storage/coda1/pace-admins/svangala3/documentation/site_files/docs/slurm-software/test_directory/raxml/RAxML_bestTree.T5
Overall execution time: 0.875795 secs or 0.000243 hours or 0.000010 days
---------------------------------------
Begin Slurm Epilog: Dec-25-2022 22:38:29
Job ID: 231349
Array Job ID: _4294967294
User ID: svangala3
Account: phx-pace-staff
Job name: raxmlTest
Resources: cpu=2,mem=4G,node=1
Rsrc Used: cput=00:00:08,vmem=644K,walltime=00:00:04,mem=0,energy_used=0
Partition: cpu-small
QOS: inferno
Nodes: atl1-1-02-020-27-1
---------------------------------------
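With 20 inferences in one report, it can be convenient to pull out the key numbers rather than read the whole file. The sketch below assumes the report lines have the same shape as the sample output above ("Inference[N] final GAMMA-based Likelihood: ..." and "Final GAMMA-based Score of best tree ..."). The function names final_likelihoods and best_score are illustrative, not part of RAxML or Slurm.

```shell
#!/bin/bash
# Print "run likelihood" pairs from the final-optimization lines,
# sorted so the highest (best) likelihood comes first.
final_likelihoods() {
    awk '/^Inference\[[0-9]+\] final GAMMA-based Likelihood:/ {
             run = $1
             gsub(/[^0-9]/, "", run)   # "Inference[3]" -> "3"
             print run, $5             # field 5 is the likelihood value
         }' "$1" | sort -k2,2nr
}

# Print the score of the best tree (the last word of the summary line).
best_score() {
    awk '/^Final GAMMA-based Score of best tree/ {print $NF}' "$1"
}

# Usage (replace <jobID> with your job's ID):
#   final_likelihoods Report-<jobID>.out | head -n 5
#   best_score Report-<jobID>.out
```

In the sample report all 20 runs end at the same final likelihood (-119.545950), so the ranking is flat; on larger alignments the runs are more likely to differ.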
- All output files can be found here.
- After the result files are produced, you can move them off the cluster. Refer to the file transfer guide for help.
- Congratulations! You successfully ran RAxML on the cluster.