Clinical Genomics and Biomarker Discovery Workshop

Workshop Details

Toronto Date: September 23-24, 2010 in Downtown Toronto, ON
Lead Faculty (2010): Sohrab Shah & Anna Lapuk
Registration Fee for Applications received before August 27, 2010: $500 + HST
Registration Fee for Applications received after August 27, 2010: $700 + HST

Target Audience
Researchers or clinicians who are interrogating clinical samples with high-throughput genomic assays for the purpose of determining clinically important molecular features.

Prerequisite: Your own laptop computer. Minimum requirements: 1024x768 screen resolution, 1.5GHz CPU, 1GB RAM, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 2-3 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW for a fee. Please contact course_info@bioinformatics.ca for more information.

Also required is some practical experience with the R statistical package, or completion of pre-reading tutorials on it. Participants should also have knowledge of and experience with high-dimensional genomic data sets such as gene expression, genotyping arrays, array CGH, and next-generation sequencing.

Course Objectives
The fields of medicine and molecular biology have converged to the extent that the concept of genomic medicine is being realized. Molecular research on clinical samples is expanding at a furious pace in the post-human-genome era as new enabling technologies are developed. In this course we will explore the analytical principles and practical tools for determining clinically relevant biomarkers from high-dimensional genomic assays on clinical samples. Concepts and hands-on labs will be demonstrated through landmark case studies from the literature and publicly available data.


Course Outline
A comprehensive lecture and laboratory manual will be provided.

Day 1
Introduction - Ice breaker and getting to know the group

Module 1: Introduction to biomarkers (Faculty: Sohrab Shah)
• Introduction: What is a biomarker?
• Measurement technologies that yield data on types of human variation

  • High density genotyping arrays
  • Next generation sequence data
  • Gene expression
  • Cytogenetics
  • PCR based point mutation analysis

• Security and privacy

  • Identifiability from genomic samples: challenges and solutions

• Introduction to feature selection and analysis of high dimensional data sets

Module 2: Identifying candidate biomarkers from a high-dimensional data set (Faculty: Shah)
• The curse of dimensionality, feature selection (presentation of concepts)
• Dealing with high-dimensional data with small sample sizes (lecture and hands-on lab with a gene expression data set)
• Experimental design in a clinical setting to avoid over-fitting

  • Discovery cohort
  • Validation cohort
  • Cross validation

• Supervised vs unsupervised clustering and classification
• Illustration and practical examples through landmark studies in the literature
• Constructing predictive models
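
To illustrate the over-fitting point in Module 2, here is a minimal sketch (not part of the official course material, and written in Python for brevity although the labs use R). It shows feature selection being repeated inside each cross-validation fold, so that held-out samples never influence which features are chosen. The simulated data, the mean-difference ranking, the nearest-centroid classifier, and all function names are illustrative assumptions.

```python
import random
from statistics import mean

def select_features(X, y, k):
    """Rank features by absolute difference in class means; keep the top k."""
    scores = []
    for j in range(len(X[0])):
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def fit_centroids(X, y, feats):
    """Per-class mean profile over the selected features."""
    return {c: [mean(row[j] for row, label in zip(X, y) if label == c)
                for j in feats]
            for c in (0, 1)}

def predict(centroids, row, feats):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[c]))
    return min(centroids, key=dist)

def cross_validate(X, y, k_features, n_folds=5, seed=0):
    """Estimate accuracy; features are re-selected on each training fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for held_out in folds:
        train = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        feats = select_features(X_tr, y_tr, k_features)  # training data only
        cents = fit_centroids(X_tr, y_tr, feats)
        correct += sum(predict(cents, X[i], feats) == y[i] for i in held_out)
    return correct / len(X)

# Simulated cohort (assumption): 40 samples, 200 features, of which the
# first 5 are shifted by 3 units in class 1; the rest are pure noise.
rng = random.Random(42)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(3.0 * label if j < 5 else 0.0, 1.0) for j in range(200)]
     for label in y]

accuracy = cross_validate(X, y, k_features=5)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Selecting features on the full data set before splitting would leak information from the held-out samples and tend to give an optimistic accuracy estimate, which is why an independent validation cohort is listed alongside cross validation.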

Day 2

Module 3 Part 1 (Lecture): Introduction to clinical variables and ‘personalized medicine’ (Faculty: Anna Lapuk)
• Concepts and case studies of predictive or personalized medicine from the literature:

  • Cancer genetics
  • Pharmacogenomics
  • Genome wide association studies

• Correlating clinical outcomes with genomic data

  • Challenges with integration of heterogeneous data types (clinical vs genomics)
  • Survival analysis (univariate and multivariate)
  • Kaplan-Meier estimates of the survival function
  • Cox proportional hazards models
  • Log-rank test
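
As a minimal sketch of the Kaplan-Meier idea listed above (an illustration only, not the course's lab code, which uses R and MedCalc), the estimator multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The function name and the toy cohort are hypothetical.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical toy cohort: six patients, two of them censored.
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients contribute to the at-risk count up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.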


Module 3 Part 2 (Lecture): Introduction to statistical and bioinformatic tools (Faculty: Lapuk)
• Concepts of R and its capabilities, with a short description of the packages needed for biomarker studies
• Alternative bioinformatic tools (freeware and low cost commercial software)

  • GenePattern freeware (Broad Institute of MIT and Harvard)
  • MedCalc commercial software


Module 4 (Lab): From Genomics to Clinical Genomics (Faculty: Lapuk)
• Extension of clustering and classification methods to integration of clinical variables
• Using the gene expression data set from Day 1 and correlating it with clinical outcomes
• Plotting survival curves and testing for differences between clinical groups (univariate and multivariate) using R and MedCalc

Course Preparation
• Download data sets to be used in the Lab (TBD)
• Install the R statistical package, then install Bioconductor and the survival package.
• GenePattern installation (info to come)
• MedCalc free trial installation (info to come)

Pre-Readings
• Module 1:

• Module 2: Chin K et al. Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell. 2006 Dec;10(6):529-41.

• Module 3:

• Module 4: An Introduction to R