Updated 2023-03-31

Run Scythe on the Cluster

Summary

  • Currently, Scythe/0.994 is available on the Cluster
  • Load Scythe with module load scythe
  • In your SBATCH script, the lines that execute Scythe must come after the line that loads the Scythe module
  • The general command to run Scythe is: scythe -a adapter_file.fasta -o trimmed_sequences.fasta sequences.fastq
  • For more options, run scythe with no arguments or visit the Scythe GitHub page
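
Putting the summary together, the relevant lines of a job script load the module first and then run the command. The file names below are the same placeholders used above, so substitute your own adapter and read files.

module load scythe
scythe -a adapter_file.fasta -o trimmed_sequences.fasta sequences.fastq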

Walkthrough: Run Scythe on the Cluster

  • To create an SBATCH script on the cluster, enter the folder you want to use, then type vim <name>.sbatch. Name the file whatever you want, but keep the .sbatch extension. Vim is a reliable and efficient text editor to use on Linux.
  • illumina_adapters.fasta can be found here
  • sequences.fastq can be found here
  • SBATCH Script can be found here
  • You can transfer the files to your account on the cluster to follow along. The file transfer guide may be helpful, or you can use scp as sketched below.
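
If you would rather copy the example files from the command line than use the file transfer guide, a single scp from your local machine is enough. The username, hostname, and destination folder below are placeholders, not values for any particular cluster.

# on the cluster, create a folder for the walkthrough first, e.g.:  mkdir ~/scythe_walkthrough
# then, on your local machine, copy the files into it:
scp illumina_adapters.fasta sequences.fastq <username>@<cluster-login-hostname>:~/scythe_walkthrough/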

Example SBATCH Script

#!/bin/bash
#SBATCH -Jscythejob
#SBATCH -A [Account]
#SBATCH -N2 --ntasks-per-node=8
#SBATCH -t2
#SBATCH -qinferno
#SBATCH -oReport-%j.out

cd $SLURM_SUBMIT_DIR
module load scythe/0.994
srun scythe -a illumina_adapters.fasta -o trimmed_sequences.fasta sequences.fastq
  • The #SBATCH directives are standard, requesting 2 nodes with 8 cores per node and a walltime of 2 minutes (a bare number passed to -t is interpreted as minutes). More on #SBATCH directives can be found in the Using Slurm on Phoenix Guide
  • $SLURM_SUBMIT_DIR is a variable that holds the directory you submitted the SBATCH script from. Make sure the .fastq and .fasta files you want to use are in the same directory as the SBATCH script; the cd $SLURM_SUBMIT_DIR line moves the job into that directory so it has access to all the files it needs
  • Output files will also show up in the same directory as the SBATCH script.
  • module load scythe/0.994 loads the Scythe module
  • srun scythe -a illumina_adapters.fasta -o trimmed_sequences.fasta sequences.fastq runs Scythe on the example files; srun launches one copy of the command per task
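
Before submitting, you can confirm the module name and version that the script loads; these are standard module commands, and the exact output format depends on the cluster's module system.

module avail scythe       # list the Scythe versions installed on the cluster
module load scythe/0.994  # load the version used in this guide
module list               # confirm scythe/0.994 now appears among your loaded modules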

Submit Job & Check Job Status

  • Make sure you're in the directory that contains the SBATCH script and the .fastq and .fasta files
  • Submit as normal, with sbatch <script name>. In this case, sbatch scythe.sbatch (use whatever name you gave the script)
  • Check job status with squeue --job <jobID>, replacing <jobID> with the job ID returned after running sbatch
  • You can cancel the job with scancel <jobID>, again replacing <jobID> with the job ID returned after running sbatch
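
As a concrete example, a full submit-and-monitor session might look like the following; the script name matches the walkthrough above, and the job ID is the one from the example report, so yours will differ.

sbatch scythe.sbatch    # prints: Submitted batch job 593270
squeue --job 593270     # shows the job state (PD = pending, R = running)
scancel 593270          # only if you need to cancel the job early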

Collect Results

  • All files created will be in the same folder where your SBATCH script is (same directory you ran sbatch from)
  • The Report-<jobID>.out file will be found here as well. It contains the output of the job, as well as diagnostics and a report of the resources used during the job. If the job fails or doesn't produce the result you were hoping for, the Report-<jobID>.out file is a great debugging tool.
  • The trimmed_sequences.fasta file that gets generated can be found here
  • Report-<jobID>.out should look like this (srun launches one copy of scythe per task, so the 16 tasks in this example each print their own trimming summary):
---------------------------------------
Begin Slurm Prolog: Jan-26-2023 14:55:15
Job ID:    593270
User ID:   svangala3
Account:   phx-pace-staff
Job name:  scythejob
Partition: cpu-small
QOS:       inferno
---------------------------------------
prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300
prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

prior: 0.300

Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900


Adapter Trimming Complete
contaminated: 1329, uncontaminated: 8671, total: 10000
contamination rate: 0.132900

---------------------------------------
Begin Slurm Epilog: Jan-26-2023 14:55:17
Job ID:        593270
Array Job ID:  _4294967294
User ID:       svangala3
Account:       phx-pace-staff
Job name:      scythejob
Resources:     cpu=16,mem=16G,node=2
Rsrc Used:     cput=00:00:48,vmem=288K,walltime=00:00:03,mem=0,energy_used=0
Partition:     cpu-small
QOS:           inferno
Nodes:         atl1-1-02-014-19-2,atl1-1-02-014-20-2
---------------------------------------
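
Once the epilog appears in the report, a quick check from the submission directory confirms the run; the commands below only use the file names from the example script, with Report-<jobID>.out standing in for your actual report file.

ls -lh trimmed_sequences.fasta Report-<jobID>.out      # the trimmed reads and the job report
head trimmed_sequences.fasta                           # peek at the first few trimmed records
grep -c "Adapter Trimming Complete" Report-<jobID>.out # one summary per task (16 in the example above)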