Infectious Disease Genomic Epidemiology

Course Objectives
With the increasing adoption of Next Generation Sequencing technologies in infectious disease surveillance and outbreak investigations, genomic epidemiology (combining pathogen genomics data with epidemiological investigations to track the spread of infectious diseases) is poised to change the practice of public health and infection control and to provide an unprecedented amount of data for pathogen evolution studies.
The CBW has developed a 3-day course providing an introduction to genomic epidemiology analysis followed by hands-on practical tutorials demonstrating the use of selected analysis tools. The tutorials are designed as self-contained units that include example data and detailed instructions for installing all required bioinformatics tools or accessing publicly available web applications.
Participants will gain practical experience and skills to be able to:
- Understand next generation sequencing (NGS) platforms as applied to pathogen genomics and metagenomics sequencing
- Analyze NGS data for pathogen surveillance and outbreak investigations
- Analyze antimicrobial resistance genes
- Detect emerging pathogens in metagenomics data
- Perform phylogeographic analysis
- Use different visualization tools for genomic epidemiology analysis
Target Audience
Graduates, postgraduates, staff bioinformaticians, laboratory technologists, medical microbiologists and PIs working with or about to embark on analysis of genomic and metagenomics data for epidemiological investigations.
Prerequisites: Basic familiarity with the Linux environment and S, R, or MATLAB. Must be able to complete and understand the following simple Linux and R tutorials (up to and including “Descriptive Statistics”) before attending:
- UNIX Tutorial (up to and including Tutorial Four) [http://www.ee.surrey.ac.uk/Teaching/Unix/]
- Quick & Dirty Guide to R [http://ww2.coastal.edu/kingw/statistics/R-tutorials/text/quick&dirty_R.txt]
You will also require your own laptop computer. Minimum requirements: 1024x768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact course_info@bioinformatics.ca for more information.
Pre-work and pre-readings can be found at https://bioinformaticsdotca.github.io/epidemiology_2018.
Course Material
Module 1: Introduction to Public Health Microbiology and Genomic Epidemiology (Will Hsiao)
Instructor(s): Will Hsiao
Content:
- Review of relevant terms and concepts
- Review of next generation sequencing and its application to microbiology
- Overview of sequence data processing
- The importance of metadata quality and curation
- Overview of different types of genomic epidemiology analysis
- Computing Resources and Requirements
- Setting up Amazon Web Service
Module 2: Pathogen Genomic Analysis I (Single Nucleotide Variants) (Gary Van Domselaar)
Instructor(s): Gary Van Domselaar
Content:
- Review of phylogenetics
- Overview of single-nucleotide-variants (SNVs)
- Whole genome SNVs analysis using NGS data
- Reference-based vs. Reference-free SNVs analysis
- Advantages and complications associated with SNVs analysis
- Integration of epidemiological data for SNVs analysis
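As a rough illustration of the idea behind SNV-based comparison, the sketch below counts pairwise nucleotide differences between isolates in an already-aligned set of sequences. It is a minimal toy example, not the SNVPhyl pipeline used in the lab; the isolate names, sequences, and function names are hypothetical, and real pipelines also handle read mapping, variant calling, and quality filtering.

```python
from itertools import combinations


def snv_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous
    base (A, C, G or T) and the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def pairwise_snv_matrix(alignment):
    """Return {(name_a, name_b): distance} for every pair of isolates."""
    return {
        (a, b): snv_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; "N" marks an ambiguous base that is skipped.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGAACNTAT",
}

distances = pairwise_snv_matrix(alignment)
for pair, dist in sorted(distances.items()):
    print(pair, dist)
```

Small pairwise distances of this kind are typically interpreted alongside epidemiological data (dates, locations, exposures) rather than on their own.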
Presentation file(s):
PPT
Lab Practical
Instructor(s): Gary Van Domselaar
Content:
- Building and interpreting phylogenetic trees using SNVs (the SNVPHYL pipeline (manuscript in preparation), available in the IRIDA platform, will be used for this exercise; additional command-line tools will also be covered)
Presentation file(s):
PDF
Module 3: Pathogen Genomic Analysis II (Marker Gene-Based Analysis) (Dillon Baker)
Instructor(s): Dillon Baker
Content:
- Overview of Multi-locus Sequence Typing (MLST) and Whole-Genome or Core-Genome MLST (WG/CG-MLST) analysis
- Concept of nomenclature databases and global surveillance
- Bacterial typing
- Publicly available MLST and WG/CG-MLST databases
- Assembly-based vs. Assembly-free MLST analysis
- Integration of epidemiological data for MLST analysis
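To make the MLST concept concrete, the sketch below assigns an allele number to each locus by exact sequence match and then looks up the resulting allelic profile in a sequence type (ST) table. It is a simplified illustration under stated assumptions: the locus names, allele sequences, and ST numbers are invented toy data, and the function names are hypothetical rather than taken from any MLST tool or database.

```python
def call_alleles(locus_sequences, allele_db):
    """Assign an allele number per locus by exact sequence match.
    A locus with no matching allele (a novel allele) gets None."""
    return {
        locus: allele_db.get(locus, {}).get(seq.upper())
        for locus, seq in locus_sequences.items()
    }


def lookup_sequence_type(profile, st_table, loci):
    """Map an allelic profile to a sequence type, or None if any locus
    is uncalled or the combination is not in the table."""
    key = tuple(profile.get(locus) for locus in loci)
    if None in key:
        return None
    return st_table.get(key)


# Hypothetical toy scheme with three loci (real schemes typically use seven).
loci = ["locA", "locB", "locC"]
allele_db = {
    "locA": {"ATGC": 1, "ATGA": 2},
    "locB": {"GGCC": 1, "GGCT": 2},
    "locC": {"TTAA": 1},
}
st_table = {(1, 1, 1): 10, (2, 1, 1): 11, (1, 2, 1): 12}

isolate = {"locA": "ATGA", "locB": "GGCC", "locC": "TTAA"}
profile = call_alleles(isolate, allele_db)
sequence_type = lookup_sequence_type(profile, st_table, loci)
print(profile, sequence_type)
```

WG/CG-MLST follows the same allele-profile idea but over hundreds or thousands of loci, which is why shared nomenclature databases matter for comparing results between laboratories.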
Presentation file(s):
PDF
Lab Practical
Instructor(s): Dillon Baker
Content:
- Building and interpreting cladograms and performing molecular typing using WG/CG-MLST
Presentation file(s):
PDF
Module 4: Antimicrobial Resistance (AMR) Gene Analysis (Andrew McArthur)
Instructor(s): Andrew McArthur
Content:
- Review of available antimicrobial resistance (AMR) resources
- Overview of the Comprehensive Antibiotic Resistance Database (CARD)
- Annotation of AMR using ResFams and Active Sites annotation
- Identification of antimicrobial resistance genes
- Challenges of Detecting AMR in Metagenomics
Presentation file(s):
PDF
Lab Practical
Instructor(s): Andrew McArthur
Content:
- Using the CARD website
- Using the Resistance Gene Identifier (RGI) and other bioinformatics tools to identify and characterize AMR genes
Presentation file(s):
PDF
Module 5: Phylogeographic Analysis (Anamaria Crisan)
Instructor(s): Anamaria Crisan
Content:
- Overview of Phylogeographic analysis
- Introduction to GenGIS (http://kiwi.cs.dal.ca/GenGIS/Main_Page)
- Introduction to Phylocanvas (http://phylocanvas.org/)
Presentation file(s):
PDF
Lab Practical
Instructor(s): Anamaria Crisan
Content:
- Using IRIDA and GenGIS to process genomic data and geographic (location) data for outbreak detection, analysis, and visualization.
Presentation file(s):
PDF
Module 6: Emerging Pathogen Detection and Identification using Metagenomic Samples (Gary Van Domselaar)
Instructor(s): Gary Van Domselaar
Content:
- Overview of Pathogen Identification using metagenomics
- Bioinformatics tools for pathogen detection and identification
- Identification of reference datasets or databases suitable for pathogen detection
Presentation file(s):
PDF
Lab Practical
Instructor(s): Gary Van Domselaar
Content:
- Pathogen identification using SURPI and Kraken (and similar tools)
Presentation file(s):
PDF
Module 7: Data Visualization Lecture (Anamaria Crisan)
Instructor(s): Anamaria Crisan
Content:
- Introduction to Summarizing Analysis Visually
- Visualization tools for Genomic Epidemiological Data:
  - IRIDA (tree visualization)
  - Microreact (tree and metadata visualization) – potentially also covered in Module 5
  - IslandViewer (genomic features viewer: genomic islands, genes, virulence factors, antimicrobial resistance genes etc.)
- Various data visualization techniques using R and Shiny
Presentation file(s):
PDF
Lab Practical
Instructor(s): Anamaria Crisan
Content:
- Hands-on tutorial of the tools covered
Presentation file(s):
PDF