Infectious Disease Genomic Epidemiology (2017)

Course Objectives

A poster announcing this workshop can be found here

With the increasing adoption of next-generation sequencing technologies for infectious disease surveillance and outbreak investigations, genomic epidemiology (combining pathogen genomics data with epidemiological investigations to track the spread of infectious diseases) is poised to change the practice of public health and infection control, and to provide an unprecedented amount of data for pathogen evolution studies.

The CBW has developed a 3-day course providing an introduction to genomic epidemiology analysis, followed by hands-on practical tutorials demonstrating the use of selected analysis tools. The tutorials are designed as self-contained units that include example data and detailed instructions for installing all required bioinformatics tools or for accessing publicly available web applications.

Participants will gain practical experience and skills to be able to:

  • Understand next generation sequencing (NGS) platforms as applied to pathogen genomics and metagenomics sequencing
  • Analyze NGS data for pathogen surveillance and outbreak investigations
  • Analyze antimicrobial resistance genes
  • Detect emerging pathogens in metagenomics data
  • Perform phylogeographic analysis
  • Use different visualization tools for genomic epidemiology analysis

Target Audience

Graduates, postgraduates, staff bioinformaticians, laboratory technologists, medical microbiologists and PIs working with or about to embark on analysis of genomic and metagenomics data for epidemiological investigations.

Prerequisites for attendance: Basic familiarity with the Linux environment and S, R, or Matlab. Participants must be able to complete and understand the following simple Linux and R tutorials before attending:

You will also require your own laptop computer. Minimum requirements: 1024x768 screen resolution, 1.5GHz CPU, 1GB RAM, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact for more information.

Course Outline

Day 1

Module 1: Introduction to Public Health Microbiology and Genomic Epidemiology (2017) (Instructor: William Hsiao)

  • Review of relevant terms and concepts
  • Review of next generation sequencing and its application to microbiology
  • Overview of sequence data processing
  • The importance of metadata quality and curation
  • Overview of different types of genomic epidemiology analysis
  • Computing Resources and Requirements
  • Setting up Amazon Web Service

Module 2: Pathogen Genomic Analysis I (2017) (Instructor: Gary Van Domselaar)

  • Review of phylogenetics
  • Overview of single-nucleotide variants (SNVs)
  • Whole-genome SNV analysis using NGS data
  • Reference-based vs. reference-free SNV analysis
  • Advantages and complications associated with SNV analysis
  • Integration of epidemiological data for SNV analysis

Lab Practical:
Building and interpreting phylogenetic trees using SNVs (the SNVPHYL pipeline (manuscript in preparation), available in the IRIDA platform, will be used for this exercise; additional command-line tools will also be covered)

Module 3: Pathogen Genomic Analysis II (2017) (Instructor: Eduardo Taboada)

  • Overview of Multi-locus Sequence Typing (MLST) and Whole-Genome or Core-Genome MLST (WG/CG-MLST) analysis
  • Concept of Nomenclature database and global surveillance
  • Bacterial typing
  • Publicly available MLST and WG/CG-MLST databases
  • Assembly-based vs. Assembly-free MLST analysis
  • Integration of epidemiological data for MLST analysis

Lab Practical:
Building and interpreting Cladograms and Molecular Typing using WG/CG-MLST

Integrated Assignment Part 1:
Applying SNV and WG-MLST analysis to a set of microbial genomes to identify clusters and to infer outbreaks

Keynote: Open Bioinformatics Takes Centre Stage in Infectious Disease (Speaker: Fiona Brinkman)

Day 2

Module 4: Antimicrobial Resistant Gene (AMR) Analysis (2017) (Instructor: Andrew McArthur)

  • Review of available antimicrobial resistance (AMR) resources
  • The Comprehensive Antimicrobial Resistance Database (CARD) Overview
  • Annotation of AMR using ResFams and Active Sites annotation
  • Identification of antimicrobial resistance genes
  • Challenges of Detecting AMR in Metagenomics

Lab Practical:

  • Using CARD website
  • Using Resistance Gene Identifier (RGI) and other bioinformatics tools to identify and characterize AMR genes

Module 5: Phylogeographic Analysis (2017) (Instructor: Robert Beiko)

  • Overview of Phylogeographic analysis
  • Introduction to GenGIS
  • Introduction to Phylocanvas

Lab Practical:
Using IRIDA and GenGIS to process genomic data and geographic (location) data for outbreak detection, analysis, and visualization.

Integrated Assignment Part 2:
Applying Antimicrobial Resistance Gene analysis and Phylogeographic analysis to a set of microbial genomes and accompanying metadata (epidemiological data consisting of location, time, etc.) to characterize clusters and to infer outbreaks

Day 3

Module 6: Emerging Pathogen Detection and Identification using Metagenomic Samples (2017) (Instructor: Gary Van Domselaar)

  • Overview of Pathogen Identification using metagenomics
  • Bioinformatics tools for pathogen detection and identification
  • Identify reference datasets or databases suitable for pathogen detection

Lab Practical:
Pathogen identification using SURPI and Kraken (and similar tools)

Module 7: Data Visualization (2017) (Instructor: Anamaria Crisan)

  • Introduction to Summarizing Analysis Visually
  • Visualization tools for Genomic Epidemiological Data
    • IRIDA (tree visualization)
    • Microreact (tree and metadata visualization)
    • IslandViewer (genomic features viewer: genomic islands, genes, virulence factors, antimicrobial resistance genes etc.)
  • Various data visualization techniques using R and Shiny

Lab Practical:
Hands-on tutorial of the tools covered

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.