Course Description

High-throughput sequencing of RNA libraries (RNA-seq) has become increasingly common and has largely supplanted gene microarrays for transcriptome profiling. When processed appropriately, RNA-seq data can provide a considerably more detailed view of the transcriptome. The CBW has developed a 3-day course that provides an introduction to RNA-seq data analysis, followed by integrated tutorials demonstrating the use of popular RNA-seq analysis packages. The tutorials are designed as self-contained units that include example data (Illumina paired-end RNA-seq data) and detailed instructions for installing all required bioinformatics tools (HISAT2, StringTie, etc.).

Course Objectives

Participants will gain practical experience and skills to be able to:

Perform command-line, Linux-based analysis on the cloud
Assess quality of RNA-seq data
Align RNA-seq data to a reference genome
Estimate known gene and transcript expression
Perform differential expression analysis
Discover novel isoforms
Visualize and summarize the output of RNA-seq analyses in R
Assemble transcripts from RNA-Seq data

Target Audience

Graduates, postgraduates, and PIs who are working on, or about to embark on, an analysis of RNA-seq data. Attendees may be familiar with some aspects of RNA-seq analysis (e.g. gene expression analysis) or have no direct experience.

Prerequisites

Basic familiarity with the Linux environment and with S, R, or MATLAB.

You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5 GHz CPU, 2 GB RAM, 10 GB free disk space, and a recent version of Windows, Mac OS X, or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, please contact support@bioinformatics.ca for other possible options.

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to Cloud Computing (Obi Griffith)

Introduction to cloud computing concepts

Lab Practical

Learn to configure, launch, and connect to an Amazon cloud instance.

Module 2: Introduction to RNA sequencing and analysis (Malachi Griffith)

Basic introduction to biology of RNA-seq
Experimental design and analysis considerations
Commonly asked questions

Lab Practical

Introduction to the test data
Examine and understand the format of raw FastQ files
Obtain reference genomes (fasta) and gene annotation resources (GTF/GFF)
Perform pre-alignment QC
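
As a rough illustration of the FastQ format examined in this practical, the following Python sketch reads four-line records (header, sequence, separator, quality string) and computes a mean base quality per read. It assumes Phred+33 quality encoding and unwrapped four-line records; the function names `parse_fastq` and `mean_quality` and the two toy reads are hypothetical and are not part of the course material.

```python
from typing import Iterator, List, Tuple


def parse_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) for each 4-line FASTQ record."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = (line.rstrip("\n") for line in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        # Assumption: qualities are Phred+33 encoded (ASCII code minus 33).
        scores = [ord(char) - 33 for char in qual]
        # The read ID is the header up to the first whitespace, without '@'.
        yield header[1:].split()[0], seq, scores


def mean_quality(scores: List[int]) -> float:
    """Average Phred score of one read (0.0 for an empty read)."""
    return sum(scores) / len(scores) if scores else 0.0


# Two made-up reads, for illustration only.
example_lines = [
    "@read1/1\n", "GATTACA\n", "+\n", "IIIIIII\n",
    "@read2/1\n", "ACGT\n", "+\n", "!5?I\n",
]

for read_id, seq, scores in parse_fastq(example_lines):
    print(read_id, len(seq), mean_quality(scores))
```

In practice a dedicated QC tool reports far more than a mean quality; the sketch is only meant to make the record layout concrete.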

Module 3: RNA-Seq alignment and visualization (Fouad Yousif)

RNA-seq alignment challenges and common questions
Alignment strategies
Introduction to HISAT2
Introduction to the BAM and BED formats
Basic manipulation of BAMs with samtools, Picard, etc.
Visualization of RNA-seq alignments – IGV
Alignment QC Assessment
BAM read counting and determination of variant allele expression status

Lab Practical

Run HISAT2 with parameters suitable for gene expression analysis
Use samtools to explore and manipulate the features of the SAM/BAM files
Use IGV to visualize HISAT2 alignments, view a variant position, load exon junctions files, etc.
Determine BAM-read counts at a variant position
Use samtools flagstat, samstat, FastQC to assess quality of alignments
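
To make the SAM/BAM exploration above more concrete, here is a minimal Python sketch that decodes the bitwise FLAG field of SAM records and produces a much-simplified tally in the spirit of `samtools flagstat`. The bit values follow the SAM specification; the helper names `decode_flag` and `simple_flagstat`, the label strings, and the toy alignment lines are hypothetical, and the real `samtools flagstat` reports more categories than this.

```python
from collections import Counter
from typing import Iterable, List

# FLAG bits as defined in the SAM specification; the labels are informal.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> List[str]:
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def simple_flagstat(sam_lines: Iterable[str]) -> Counter:
    """Count total, mapped, and properly paired records (simplified)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        flag = int(line.rstrip("\n").split("\t")[1])  # FLAG is column 2
        counts["total"] += 1
        if not flag & 0x4:
            counts["mapped"] += 1
        if flag & 0x2:
            counts["properly_paired"] += 1
    return counts


# Made-up records, for illustration only.
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr22\t100\t60\t7M\t=\t200\t107\tGATTACA\tIIIIIII",
    "r1\t147\tchr22\t200\t60\t7M\t=\t100\t-107\tTGTAATC\tIIIIIII",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

print(decode_flag(99))
print(dict(simple_flagstat(example_sam)))
```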

Module 4: Expression and differential expression (Obi Griffith)

Expression estimation for known genes and transcripts
FPKM/TPM expression estimates vs. raw counts
Differential expression methods
Downstream interpretation of expression and differential expression estimates
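
The difference between raw counts and the FPKM/TPM estimates listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it uses the annotated transcript length rather than an effective length, treats the sum of the supplied counts as the total number of mapped fragments, and uses made-up gene names, counts, and lengths. It is not how StringTie or any other specific tool computes its estimates.

```python
from typing import Dict


def fpkm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts.values())  # assumption: these are all mapped fragments
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total)
        for gene in counts
    }


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    rate_sum = sum(rates.values())
    return {gene: rate / rate_sum * 1e6 for gene, rate in rates.items()}


# Hypothetical counts and transcript lengths, for illustration only.
example_counts = {"geneA": 10, "geneB": 20, "geneC": 30}
example_lengths = {"geneA": 1000, "geneB": 4000, "geneC": 1000}

for gene, value in tpm(example_counts, example_lengths).items():
    print(gene, round(value, 1))
```

Note that geneB has twice the raw count of geneA but a lower TPM, because it is four times as long. TPM values sum to one million within a sample by construction, whereas FPKM values do not have a fixed sum.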

Lab Practical

Generate gene/transcript expression estimates with StringTie
Perform differential expression analysis with Ballgown
Summarize and visualize differential expression results

Module 5: Reference-free alignment (Malachi Griffith)

Explore the use of Kallisto to get abundance estimates without first aligning to a reference

Module 6: Isoform discovery and alternative expression (Malachi Griffith)

Explore the use of StringTie in reference annotation based transcript (RABT) assembly mode and in de novo assembly mode. Both modes require a reference genome sequence.

Lab Practical

Run StringTie in alternate modes more conducive to isoform discovery and explore the results

Module 7: Genome-Free De Novo Transcript Assembly (Brian Haas)

Reconstructing transcripts using Trinity
Genome-free transcript quantification and differential expression analysis

Lab Practical

Assemble RNA-Seq transcripts

Module 8: Functional Annotation and Analysis of Transcripts (Brian Haas)

Lab Practical

Explore TrinotateWeb for navigating transcript annotation and expression data

Workshop Details:

Duration: 3 days

Start: Jul 11, 2019

End: Jul 13, 2019

Location: Toronto, Ontario Canada

Status: Registration Closed

Workshop Ended

Limited to: 30 participants

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.

Posted on:

April 21, 2022

(2019) Informatics for RNA-seq Analysis
