Course Description

Microbes are everywhere, and the study of microbiomes to elucidate the impact of microbes in various ecosystems has become a multidisciplinary science involving microbiology, statistics, computer science, and molecular biology. Microbiomics, the study of microbes without first culturing and isolating the organisms, has become the principal approach to exploring the diversity, function, and ecology of microbial communities. The CBW has developed a 2-day course that introduces metagenomic data analysis (both read-based and assembly-based), with hands-on practical tutorials in each session that demonstrate relevant bioinformatics and statistical tools. Each module consists of a lecture covering both theoretical and practical components, followed by a hands-on bioinformatics tutorial guided by instructors and teaching assistants.

Course Objectives

Participants will gain practical experience and skills to be able to:

  • Understand the advantages and limitations of metagenomic data analysis
  • Devise an appropriate bioinformatics workflow for processing and analyzing microbiome shotgun metagenomic sequence data
  • Perform both read-based and assembly-based analyses of shotgun metagenomic sequence data
  • Apply appropriate statistics to undertake rigorous data analysis

Target Audience

Graduates, postgraduates, staff bioinformaticians and PIs working with or about to embark on analysis of shotgun metagenomic sequence data from microbiome-focused experiments. 


Participants should be comfortable with reading and writing basic R or Bash, or be enrolled in the Beginner Microbiome Analysis course. Participants should have some prior experience with microbiome sequencing data (for example, have completed a marker gene analysis of microbiome data).

You will require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 2.4 GHz CPU, 8 GB RAM, 100 GB free disk space, and a recent version of Windows, Mac OS X, or Linux (most computers purchased in the past 3-4 years likely meet these requirements).

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to metagenomics and read-based profiling

  • Challenges and benefits of metagenomic data
  • Comparison of major approaches (read-based vs assembly-based)
  • Importance of sequencing depth and host contamination
  • Long-read vs short-read sequencing
  • Approaches for assigning taxonomy to shotgun metagenomic data

Lab practical

  • Quality control of short reads and removal of host contamination using KneadData in the Unix command line
  • Taxonomic profiling of short reads using Kraken and MetaPhlAn in the Unix command line
  • Analysis of taxonomic assignment results using phyloseq in R

Module 2: Metagenomic Assembly and Binning

  • Overview of binning theory/approaches, advantages/disadvantages
  • Quality metrics for assembly and binning 
  • Genome-resolved metagenomics using the anvi’o ecosystem
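
As a rough illustration of one commonly reported assembly quality metric, the sketch below computes N50: the contig length at which the contigs of that length or longer together cover at least half of the total assembly. It is a generic example rather than the procedure taught in this module, and the function name and contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig to the shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop once the accumulated length reaches half of the assembly.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

A higher N50 generally indicates a more contiguous assembly, but it says nothing about correctness or completeness, which is why it is usually read alongside other metrics.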

Lab practical

  • Assembly of short reads into contigs, binning of contigs, and refinement of bins into Metagenome-Assembled Genomes (MAGs) using anvi’o in the Unix command line
  • Metrics for assessing MAG quality using anvi’o in the Unix command line

Module 3: Metagenomic functional annotation

  • Approaches for assigning functions to shotgun metagenomic data
  • Microbial gene annotation and normalizations
  • Methods for assigning functions to reads, and how annotating all reads differs from targeting specific functions such as antimicrobial resistance (AMR) or carbohydrate metabolism genes
  • Stratifying functions by taxonomy
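
To make the idea of normalization concrete, here is a minimal sketch of one generic scheme: dividing each gene's read count by its length, then rescaling so the values sum to one million (a TPM-style calculation). It is not presented as the normalization used by any tool in this course, and the function name, gene names, counts, and lengths are all hypothetical.

```python
def tpm(read_counts, gene_lengths_bp):
    """Length-normalize read counts and rescale them to sum to one million."""
    # Reads per base: corrects for longer genes attracting more reads.
    rates = {gene: read_counts[gene] / gene_lengths_bp[gene] for gene in read_counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        return {gene: 0.0 for gene in read_counts}
    # Rescale so samples with different sequencing depths are comparable.
    return {gene: rate / total_rate * 1_000_000 for gene, rate in rates.items()}


# Hypothetical counts and gene lengths, for illustration only.
counts = {"geneA": 100, "geneB": 100}
lengths = {"geneA": 1000, "geneB": 2000}
normalized = tpm(counts, lengths)
print(normalized)
```

In this example both genes receive the same number of reads, but geneA is half as long, so its normalized value comes out twice that of geneB.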

Lab practical

  • General functional annotation using MMseqs2 and HUMAnN in the Unix command line
  • AMR annotation of reads using CARD RGI in the Unix command line
  • Visualization of functions stratified by taxonomy using JarrVis in R

Module 4: Advanced microbiome statistics

  • How to incorporate confounding variables into statistical analyses of microbiome data (e.g. age and sex in human-based studies)
  • Using machine learning (e.g. Random Forests) for diagnostics or identification of important microbial features

Lab practical

  • Statistics and machine learning in R

Workshop Details:

Duration: 2 days

Start: May 29, 2024

End: May 30, 2024

Location: St. John's, Newfoundland and Labrador, Canada

Status: Registration Closed

Workshop Ended

CAD $495 for applications received from February 7, 2024 to April 10, 2024
CAD $695 for applications received from April 11, 2024 to May 15, 2024
Limited to: 30 participants
Open Access Content:

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.

