Course Description

With the introduction of high-throughput sequencing platforms, it is becoming feasible to use sequencing approaches to address many research questions. However, how to manage and interpret the large volume of sequence data these technologies produce is less clear. The CBW has developed a popular 3-day course covering the bioinformatics tools available for managing and interpreting high-throughput sequencing data, including both short- and long-read approaches.

Course Objectives

Beginning with an understanding of major sequencing technologies, participants will gain practical experience and skills to be able to:

  • Assess sequence quality
  • Map sequence data onto a reference genome
  • Identify variants, including single-nucleotide variants, indels, and structural variants
  • Perform de novo genome assembly and evaluate its quality and completeness
  • Integrate biological context with sequence information

Target Audience

This workshop is intended for graduate students, post-doctoral fellows, clinical fellows and investigators involved in analyzing data from high-throughput sequencing platforms.

Prerequisites

UNIX familiarity is required.

You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact support@bioinformatics.ca for more information.

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to High Throughput Sequencing (Dr. Erin Pleasance)

  • Overview of high-throughput sequencing technologies; major players and their strengths and weaknesses (Illumina, Nanopore, PacBio)
  • Accuracy, throughput, cost trends
  • Long- versus short-read sequencing technologies and when to use each
  • Other sequencing-based applications (full-length RNA, single-cell, etc.)
  • Knowing your role in data management and data reproducibility

Module 2: Genome Visualization (Dr. Erin Pleasance)

  • Purposes and types of visualization in genomics
  • Introduction to genomic data visualization tools and how they can be used to visualize sequencing read data: UCSC Genome Browser, Ensembl, IGV

Lab Practical: Variant detection and visualization within the genome using IGV

Module 3: De Novo Assembly (or “Putting It All Together”) (Dr. Ido Hatam)

  • Fundamentals of de novo assembly
  • Data types for assembly
    • FASTQ file structure
    • uBAM files (relevant to base modifications with ONT data)
  • Steps for assembly
    • QC of FASTQ files for short versus long reads, which present different error profiles (ties back to the sequencing strategies covered in Module 1)
  • Overview of commonly used software
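
As a rough illustration of the FASTQ file structure listed above, the sketch below reads four-line records and decodes the quality string, assuming the common Phred+33 encoding. The function name `parse_fastq` and the example record are hypothetical and are not taken from the workshop material.

```python
def parse_fastq(text):
    """Yield (read_id, sequence, phred_scores) from four-line FASTQ records.

    Assumes Phred+33 quality encoding and unwrapped (single-line) sequences.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must have four lines each")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Each quality character encodes a Phred score as ASCII code minus 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Hypothetical single-read example.
EXAMPLE = "@read1\nACGT\n+\nII5!\n"

for read_id, seq, scores in parse_fastq(EXAMPLE):
    mean_q = sum(scores) / len(scores)
    print(read_id, seq, scores, f"mean Q = {mean_q:.1f}")
```

In practice a dedicated QC tool would be used for this step; the sketch only shows what the four lines of a record contain.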

Lab Practical: Perform a de novo assembly task (from QC to de novo assembly)

Module 4: Genome Alignment (Dr. Ido Hatam)

  • What is involved in mapping reads to a reference genome
  • T2T assemblies vs. current reference genomes
  • What are the SAM/BAM file formats
  • Some common terminology used to describe alignments
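
To make the SAM format and some of the alignment terminology above more concrete, here is a minimal sketch that splits one SAM alignment line into a few of its mandatory fields and decodes the bitwise FLAG value. The helper names and the example line are hypothetical; the flag bit meanings follow the SAM specification.

```python
# FLAG bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def parse_sam_line(line):
    """Extract a few of the 11 mandatory SAM fields from one alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has at least 11 fields")
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),   # 1-based leftmost mapping position
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }


# Hypothetical alignment line for illustration only.
EXAMPLE_LINE = "read1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tIIII"

record = parse_sam_line(EXAMPLE_LINE)
print(record["qname"], record["rname"], record["pos"], decode_flag(record["flag"]))
```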

Lab Practical: Genome alignment exercise using genomic sequence data

Module 5: Small-Variant Calling and Annotation (Dr. Ido Hatam)

  • SNPs, SNVs, and short indels: what they are and why to look for them
    • SNP calling tools for long and short reads
    • Reads and why they matter
    • Aligners and why they matter
      • Expected output files and their format
    • Processing mapped reads
      • Recalibrate base quality score
      • Duplicate marking/removal 
  • Detecting variants and factors taken into account by the SNP callers
  • Different types of SNP calling: haploid/diploid, trio, somatic mutations, pooled
  • Determining which SNPs are good from the millions detected
  • INDEL cleaning
  • Standard file formats for SNPs
  • Introduction to SNP calling tools and how they compare with each other
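
As a small illustration of the difference between SNVs and short indels in a standard variant file, the sketch below reads the fixed columns of a VCF data line and labels each alternate allele by comparing allele lengths. This is a simplified assumption-laden example: the function names and the example line are hypothetical, and real variant callers and annotators apply far more logic than this.

```python
def classify_allele(ref, alt):
    """Label one ALT allele relative to REF using allele lengths only."""
    # Symbolic alleles (<DEL>), breakends ([ or ]) and '*' are not plain sequences.
    if alt.startswith("<") or "[" in alt or "]" in alt or alt == "*":
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "indel"
    return "MNV"  # same length, more than one base


def parse_vcf_line(line):
    """Return (chrom, pos, ref, [(alt, label), ...]) from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line has at least 8 fixed columns")
    chrom, pos, _id, ref, alt = fields[:5]
    # Multi-allelic sites list ALT alleles separated by commas.
    alleles = [(a, classify_allele(ref, a)) for a in alt.split(",")]
    return chrom, int(pos), ref, alleles


# Hypothetical multi-allelic site for illustration only.
EXAMPLE_VCF_LINE = "chr1\t12345\t.\tA\tG,AT\t50\tPASS\t."

print(parse_vcf_line(EXAMPLE_VCF_LINE))
```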

Lab Practical: SNP detection exercise

Module 6: Structural Variation (Dr. Ido Hatam)

  • Structural variants (SVs), different types, mechanisms that give rise to SVs, and how SVs and CNVs differ
  • Differences between human and model organism genomes
  • Detecting SVs via sequencing (read pair, read depth, combined approach, local de novo assembly) and which SV types are detectable by which strategies
  • Introduction to SV detection tools
  • File formats used to describe SVs

Lab Practical: SV discovery in a single human genome, with a brief introduction to SV visualization and interpretation; possibly a comparison of short- versus long-read SV detection; exploring an assembly summary report

Workshop Details:

Duration: 3 days

Start: Nov 04, 2026

End: Nov 06, 2026

Location: Vancouver, British Columbia, Canada

Status: Application Open

Offers:
CAD $864 for applications received from April 14, 2026 to September 4, 2026
CAD $1064 for applications received from September 5, 2026 to October 22, 2026
Limited to: 28 participants
Open Access Content:

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.
