
Course Description

Participants will gain the practical experience and skills needed to use R to visualize and investigate patterns in their data.

Target Audience

Graduates, postgraduates, and PIs who design and execute strategies for data analysis and who are using the R statistical workbench.

Prerequisites

You are expected to be a regular user of R. If you do not regularly use R, please begin by taking the Introduction to R workshop.

You will require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5 GHz CPU, 2 GB RAM, 10 GB free disk space, and a recent version of Windows, Mac OS X, or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact support@bioinformatics.ca for more information.

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Exploratory data analysis overview & clustering

  • Knowing your data: An overall workflow for exploratory data analysis
    • Understand the difference between response variables, explanatory variables, biological variation, technical variation, and batch effects
    • Missing data: understand how to identify structured versus unstructured missingness, and the role of imputation
    • Finding unwanted sources of variation: surrogate variable analysis and RUVSeq
  • Knowing your data’s structure: calculating “distance” between (high-dimensional) data points
    • What distance metrics represent
    • Different kinds of distance metrics and when to use them
  • Clustering principles & methods
    • Why cluster?
    • A survey of clustering methods
    • Choose the clustering method that is right for your data
  • Assessing the quality of clustering results
    • Metrics for identifying the optimal number of clusters
    • Existential questions introduced by clustering

Module 2: Dimensionality reduction

  • What dimensionality reduction is, and its common applications in bioinformatics
  • Dimensionality reduction with Principal Components Analysis (PCA)
    • Conduct PCA on different types of data
    • Get information out of PCA objects in R 
  • Some practical uses of PCA 
    • Plot and learn from PCA output
    • Use PCs as control variables in your analysis
    • Use PCs as variables of interest in your analysis
  • Other types of dimensionality reduction
    • t-distributed stochastic neighbor embedding (t-SNE)
    • Uniform manifold approximation and projection (UMAP)

Module 3: Fitting generalized linear models

  • Read different data files into R
  • Merge data and handle missing values
  • Use ggplot2 to create and modify publication-quality R plots
  • Plot and fit a linear model for a continuous-valued outcome, and a logistic model for a dichotomous outcome

Module 4: Differential expression analysis

  • Manually conduct many parallel statistical tests
    • Different types of statistical tests
    • Evaluate and plot output 
    • Extract output for tables 
    • Visualize p-values from multiple testing: Q-Q plot, volcano plot
    • Correct for multiple statistical tests: Bonferroni, false discovery rate
  • Using Bioconductor for analysis
    • Perform differential expression analysis

Workshop Details:

Duration: 2 days

Start: Jun 28, 2023

End: Jun 29, 2023

Location: Toronto, Ontario, Canada

Course Mode: Onsite

Status: Registration Closed

Workshop Started

Offers:

CAD $475 for applications received between February 1, 2023 and April 28, 2023
CAD $675 for applications received between April 29, 2023 and June 14, 2023
Limited to: 30 participants

Lead Instructors:

Open Access Content:

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.

