Updated 2021-05-17

FastQC

Run FastQC in Batch Mode on the Cluster

Overview

  • This guide will cover how to run FastQC in batch mode
  • Running FastQC in batch mode means you pass an input file, such as a .fastq file, to FastQC from a PBS script. After you submit the PBS script, the job runs on its own with no need to monitor it. FastQC creates a .zip file with images and reports, as well as a .html report
  • FastQC can also be run interactively (with a GUI)

Summary

  • Run module avail fastqc to see all available FastQC versions on the cluster
  • In the PBS script:
    • Load FastQC with module load fastqc
    • Run FastQC with fastqc <input file name>

Warning

Users have reported that loading other Perl modules can disable the use of fastqc on RHEL6 systems. A workaround is to unload all Perl modules before using fastqc.
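
The workaround above could look like the following in a job script. This is only a sketch under stated assumptions: loaded_perl_modules is a hypothetical helper, Perl modules are assumed to have names starting with "perl", and module list is assumed to print its listing to standard error. Run module list on your system to confirm the actual module names.

```shell
#!/bin/bash
# Hypothetical helper: read `module list` output on stdin and print the names
# of loaded modules that start with "perl" (assumed naming; verify locally).
loaded_perl_modules() {
    awk '{ for (i = 1; i <= NF; i++) if (tolower($i) ~ /^perl/) print $i }'
}

# Unload each loaded Perl module, then load FastQC.
# This part only runs when the `module` command is defined in this shell.
if type module >/dev/null 2>&1; then
    for m in $(module list 2>&1 | loaded_perl_modules); do
        module unload "$m"
    done
    module load fastqc
fi
```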

Walkthrough: Run FastQC in Batch Mode on the Cluster

Part 1: The PBS Script

#PBS -N fastqcTest
#PBS -A [Account]
#PBS -l nodes=1:ppn=4
#PBS -l walltime=2:00
#PBS -q inferno
#PBS -j oe
#PBS -o fastqc.out

cd $PBS_O_WORKDIR
module load fastqc
fastqc SRR081241.filt.fastq
  • The #PBS directives are standard, requesting just 2 minutes of walltime and 1 node with 4 cores. More on #PBS directives can be found in the PBS guide
  • $PBS_O_WORKDIR is a variable that holds the directory you submit the PBS script from. Make sure the .fastq file you want to analyze is in the same directory as the PBS script. The cd line tells the cluster to enter that directory, so the job has access to all the files it needs
  • Output files, such as the resulting .html report and .zip file of results, will also show up in the same directory as the PBS script.
  • module load fastqc loads FastQC 0.11.2
  • fastqc SRR081241.filt.fastq executes FastQC on the input file
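
If the input file is missing from the submission directory, the problem only shows up once the job starts running. A minimal sketch of a guard that could stand in for the last line of the script is shown here. The name run_fastqc is a hypothetical helper and is not part of FastQC.

```shell
#!/bin/bash
# Hypothetical helper: run FastQC on one file, but stop early with a clear
# message if the file is not readable from the current directory.
run_fastqc() {
    local input=$1
    if [ ! -r "$input" ]; then
        echo "Input file not found or unreadable: $input" >&2
        return 1
    fi
    fastqc "$input"
}

# In the PBS script, after `cd $PBS_O_WORKDIR` and `module load fastqc`:
#   run_fastqc SRR081241.filt.fastq
```

Because of -j oe and -o fastqc.out in the script above, the error message would end up in fastqc.out along with the rest of the job output.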

Part 2: Submit the Job and Check Status

  • Make sure you're in the directory that contains the PBS script and the .fastq file
  • Submit as normal with qsub <pbs script name>. In this case, qsub fastq.pbs
  • Check job status with qstat -u username3 -n, replacing "username3" with your GT username
  • You can delete the job with qdel 22182721, replacing the number with the job ID returned by qsub
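
The submit, check, and delete commands above can be chained by capturing the job ID that qsub prints. The following is a minimal sketch, assuming qsub writes an ID of the form <number>.<server> to standard output (the form shown in the sample fastqc.out in Part 3) and that $USER matches your GT username. The name job_number is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: strip the server suffix from a full job ID,
# e.g. 22971857.shared-sched.pace.gatech.edu -> 22971857
job_number() {
    echo "${1%%.*}"
}

# Only attempt submission where the scheduler commands exist.
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub fastq.pbs)      # qsub prints the new job's ID
    qstat -u "$USER" -n          # assumes $USER is your GT username
    echo "To cancel: qdel $(job_number "$jobid")"
fi
```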

Part 3: Collecting Results

  • In the directory where you submitted the PBS script, you should see a couple of newly generated files, including fastqc.out, SRR081241.filt_fastqc.html, and SRR081241.filt_fastqc.zip.
  • fastqc.out contains the status of the job and a log of how the analysis went. The .html file contains the full report and can be opened with any web browser. The .zip file contains the report itself along with all the images, graphs, and resources displayed in it.
  • Open fastqc.out using a text editor such as vim with vim fastqc.out. The file should look something like this:
Job name:   fastqcTest
Queue:      inferno
End PBS Prologue Mon Nov 26 08:54:52 EST 2018
---------------------------------------
Started analysis of SRR081241.filt.fastq
Approx 5% complete for SRR081241.filt.fastq
Approx 10% complete for SRR081241.filt.fastq
Approx 15% complete for SRR081241.filt.fastq
Approx 20% complete for SRR081241.filt.fastq
Approx 25% complete for SRR081241.filt.fastq
Approx 30% complete for SRR081241.filt.fastq
Approx 35% complete for SRR081241.filt.fastq
Approx 40% complete for SRR081241.filt.fastq
Approx 45% complete for SRR081241.filt.fastq
Approx 50% complete for SRR081241.filt.fastq
Approx 55% complete for SRR081241.filt.fastq
Approx 60% complete for SRR081241.filt.fastq
Approx 65% complete for SRR081241.filt.fastq
Approx 70% complete for SRR081241.filt.fastq
Approx 75% complete for SRR081241.filt.fastq
Approx 80% complete for SRR081241.filt.fastq
Approx 85% complete for SRR081241.filt.fastq
Approx 90% complete for SRR081241.filt.fastq
Approx 95% complete for SRR081241.filt.fastq
Analysis complete for SRR081241.filt.fastq
---------------------------------------
Begin PBS Epilogue Mon Nov 26 08:55:00 EST 2018
Job ID:     22971857.shared-sched.pace.gatech.edu
User ID:    shollister7
  • To view the full report, you can use firefox on the cluster to view the html report
  • Run firefox SRR081241.filt_fastqc.html

Caution

You must be logged in with display forwarding enabled, meaning you have to pass -X or -Y to ssh when you log in. Otherwise, the display (firefox window) cannot be opened.
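
One quick way to tell whether display forwarding is active before running firefox is to check the DISPLAY environment variable, which ssh -X or ssh -Y normally sets at login. A minimal sketch follows. The name has_display is a hypothetical helper, and a non-empty DISPLAY does not by itself guarantee the window will open.

```shell
#!/bin/bash
# Returns success when the DISPLAY variable is set and non-empty,
# which is what `ssh -X` or `ssh -Y` normally arranges on login.
has_display() {
    [ -n "${DISPLAY:-}" ]
}

if has_display; then
    echo "Display available: $DISPLAY"
else
    echo "No display; reconnect with ssh -X or ssh -Y" >&2
fi
```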

  • Log out and log back in with ssh -X gtusername3@login-s.pace.gatech.edu if the display won't open
  • The report will look something like this:

Screenshot

  • After the result files are produced, you can move the .zip file as well as any other files off the cluster. Refer to the file transfer guide for help.
  • Congratulations! You successfully ran a FastQC program in batch mode on the cluster.
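
The checks from Part 3 (reading fastqc.out and looking for the report files) can also be done from the shell. The following is a minimal sketch with two hypothetical helpers. It assumes the naming pattern seen in this walkthrough, where input.fastq produces input_fastqc.html and input_fastqc.zip, and it has not been checked against other input extensions.

```shell
#!/bin/bash
# Hypothetical helper: print the report file names expected for a .fastq input,
# following the pattern seen above (input.fastq -> input_fastqc.html / .zip).
fastqc_report_names() {
    local base
    base=$(basename "$1" .fastq)
    printf '%s_fastqc.html\n%s_fastqc.zip\n' "$base" "$base"
}

# Hypothetical helper: succeed if the job log ($1) records a finished
# analysis for the given input file ($2).
analysis_complete() {
    grep -qF "Analysis complete for $(basename "$2")" "$1"
}

# Example, run from the directory where the job was submitted:
#   analysis_complete fastqc.out SRR081241.filt.fastq && \
#       ls -l $(fastqc_report_names SRR081241.filt.fastq)
```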

Run FastQC Interactively on the Cluster

Overview

  • Running FastQC interactively on the cluster follows the same general steps as any interactive job. These steps are:
    • Set up VNC Session
    • Load fastqc module. Use module avail fastqc to see what versions of fastqc are available
    • Run FastQC with the command fastqc

Set up VNC Session

  • Please see the VNC guide for instructions on how to set up the Interactive VNC session

Load FastQC

  • Open a terminal in the VNC window by clicking Applications (top left) > System Tools, then scrolling down to Terminal
  • All commands from here on are typed in the terminal inside the VNC session
  • On the cluster, fastqc/0.10.1 and fastqc/0.11.2 are available
  • module load fastqc/0.11.2 will load fastqc. Replace the number at the end with the version number you want to load
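
The load step can be wrapped so that the version is chosen in one place and confirmed right away. This is a minimal sketch: load_fastqc is a hypothetical helper, and it assumes the installed FastQC accepts the --version option.

```shell
#!/bin/bash
# Hypothetical helper: load a given FastQC module version (default 0.11.2,
# one of the two versions listed above) and print the version FastQC reports.
load_fastqc() {
    local version=${1:-0.11.2}
    module load "fastqc/$version" && fastqc --version
}

# In the VNC terminal:
#   module avail fastqc
#   load_fastqc 0.10.1
```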

Run FastQC

  • Run with the command fastqc

Screenshot