Analysis of Metagenomic Data

Course Objectives
Metagenomics, the sequencing of DNA directly from a sample without first culturing and isolating the organisms, has become the principal tool of “meta-omic” analysis. It can be used to explore the diversity, function, and ecology of microbial communities. The CBW has developed a 3-day course providing an introduction to metagenomic data analysis followed by hands-on practical tutorials demonstrating the use of metagenome analysis tools. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools.
Participants will gain practical experience and skills to be able to:
- Design appropriate microbiome-focused experiments
- Understand the advantages and limitations of metagenomic data analysis
- Devise an appropriate bioinformatics workflow for processing and analyzing metagenomic sequence data (marker-gene, shotgun metagenomic, and metatranscriptomic data)
- Apply appropriate statistics to undertake rigorous data analysis
- Visualize datasets to gain intuitive insights into their composition and/or activity
Target Audience
Graduates, postgraduates, staff bioinformaticians and PIs working with or about to embark on analysis of marker genes, metagenomic, and metatranscriptomic data from microbiome-focused experiments.
Prerequisites: Basic familiarity with the Linux environment and with statistical analysis is required. Participants must be able to complete and understand the following simple Linux tutorial before attending:
You will also require your own laptop computer. Minimum requirements: 1024x768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Send us an email for more information.
Pre-work and pre-readings can be found on the student workshop pages.
Course Material
Module 1: Introduction to Metagenomics (Will Hsiao)
Instructor(s): Will Hsiao
Content:
- Review of relevant terms (microbial communities, microbiome, species, metagenome, marker genes, metatranscriptomics)
- Technologies used in meta’omics
- Experimental design and sample preparation considerations
- Meta’omic surveys: primary objectives, types, and workflows
- 16S rRNA genes vs. shotgun sequencing
- Starting points in metagenome data analysis: sequence files, resources, reference databases
Presentation file(s):
PDF
Discussion
Module 2: Marker Gene-based Analysis of Taxonomic Composition (Will Hsiao)
Instructor(s): Will Hsiao
Content:
- Advantages of marker-gene based analysis
- Reference databases
- Sequence quality, de-replication, and error correction
- Alpha- and beta-diversity measures
- Comparison of samples based on taxonomic compositions
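As a small illustration of the diversity measures listed above, the following sketch computes one common alpha-diversity measure (the Shannon index) and one common beta-diversity measure (Bray-Curtis dissimilarity) from per-taxon read counts. The function names and the example counts are hypothetical and are not taken from the course material; the module may cover other measures or use dedicated tools.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples.

    Both lists must hold counts for the same taxa in the same order.
    Returns 0.0 for identical samples and 1.0 for samples sharing no taxa.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same taxa")
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    denominator = sum(counts_a) + sum(counts_b)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical read counts for three taxa in two samples.
    sample_a = [10, 10, 0]
    sample_b = [5, 5, 10]
    print("Shannon index, sample A:", shannon_index(sample_a))
    print("Bray-Curtis, A vs B:", bray_curtis(sample_a, sample_b))
```

Alpha diversity summarizes a single sample, while beta diversity compares pairs of samples, which is why the second function takes two count vectors.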
Presentation file(s):
PDF
Lab Practical
Module 3: Introduction to PICRUSt (Morgan Langille)
Instructor(s): Morgan Langille
Content:
- Approaches for metagenomic inference
- An overview of the PICRUSt approach
- Limitations to metagenomic inference
- PICRUSt 2.0: Functional predictions from 16S data
Presentation file(s):
PDF
Lab Practical
Instructor(s): Morgan Langille
Content:
- Functional predictions from 16S data
Presentation file(s):
PDF
Module 4: Metagenomic Taxonomic and Functional Composition
Instructor(s): Gavin Douglas
Content:
- Contrast taxonomic and functional annotation
- Discuss the difficulties of determining the taxonomic and functional composition of a metagenomic sample
- Comparison of taxonomic assignment methods
- Binning-based methods (assigning taxonomy to most reads)
- Marker-based methods (using only some of the shotgun sequence data)
- Overview of functional databases: KEGG (KOs, Modules, Pathways), MetaCyc, COG, SEED, GO, PFAM, your own custom database
- An overview of several existing methods
- An in-depth description of MetaPhlAn2 and HUMAnN2
Presentation file(s):
PDF
Lab Practical
Instructor(s): Gavin Douglas
Content:
- Assign taxonomy with MetaPhlAn2
- Functionally annotate reads using HUMAnN2
- Visualize taxonomic and functional differences across samples
Presentation file(s):
PDF
Module 5: Metagenome Assembly, Binning, and Extracting Genomes from Metagenomes (Laura Hug)
Instructor(s): Laura Hug
Content:
- Assembling metagenomes
- Binning
- Pulling genomes from metagenomes
Presentation file(s):
PDF
Lab Practical
Module 6: Metatranscriptomics (John Parkinson)
Instructor(s): John Parkinson
Content:
- Gene expression in a microbiome vs. functional composition
- RNA-seq applied to microbiomes:
- Experimental design: additional considerations
  - Sample collection, storage and preparation
  - Processing metatranscriptomics reads: filters and assembly
- Functional and taxonomic inference from metatranscriptome reads
  - Tools for functional inference: pathways, processes and networks
- GIST for taxonomic inference
Presentation file(s):
PDF
Lab Practical
Instructor(s): John Parkinson
Content:
- Reads to function and RPKM statistics
- Reads to taxonomy using GIST
- Statistics using ALDEx2
- Functional and taxonomic visualization with Cytoscape
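The RPKM statistic mentioned in the list above (reads per kilobase of transcript per million mapped reads) normalizes a read count by both gene length and sequencing depth. The sketch below uses the standard definition; the function name and the numbers are hypothetical and do not come from the lab material.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical gene: 500 reads mapped to a 2,000 bp gene
    # in a library of 10 million mapped reads.
    print(rpkm(500, 2_000, 10_000_000))
```

Because the value is scaled by library size, RPKM values are comparable between genes within a sample, but comparisons across samples generally call for the kind of statistical treatment covered later in the lab.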
Presentation file(s):
PDF
Module 7: Statistical Tests for Metagenomics (Robert Beiko)
Instructor(s): Robert Beiko
Content:
- Appropriate statistical tests for metagenomics
Presentation file(s):
PDF
Lab Practical
Module 8: Biomarker selection (Fiona Brinkman)
Instructor(s): Fiona Brinkman
Content:
- Benefits and applications of biomarkers
- Types of markers: taxonomic, functional
- Examples of existing biomarkers
- Methods for identifying new markers
  - Normalization, copy number variation, and other considerations
  - Finding differential features: categorical, correlative
  - Ranking features
  - Network-based analysis
- Towards a genetic test: Designing PCR/qPCR primers/tests
- Example of biomarker ID success
- General considerations, cautionary notes
Presentation file(s):
PDF