Informatics and Statistics for Metabolomics

Course Objectives

Using high-throughput technologies, life science researchers can identify and characterize the small molecules, or metabolites, in a given cell, tissue, or organism. The CBW course covers metabolomics technologies, data collection and analysis, pathway databases and pathway analysis, univariate and multivariate statistics, and metabolomic and chemical databases. Hands-on practical tutorials using a variety of data sets and tools will help participants learn metabolomics analysis techniques.

Participants will gain practical experience and skills to be able to:

  • Design appropriate metabolome-focused experiments
  • Understand the advantages and limitations of metabolomic data analysis
  • Devise an appropriate bioinformatics workflow for processing and analyzing metabolomic data
  • Apply appropriate statistics to undertake rigorous data analysis
  • Visualize datasets to gain intuitive insights into the composition and/or activity of their metabolome

Target Audience

This course is intended for graduate students, post-doctoral fellows, clinical fellows and investigators who are interested in learning about both bioinformatic and cheminformatic tools to analyze and interpret metabolomics data.


You will require your own laptop computer. Minimum requirements: 1024x768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X, or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, please contact the organizers about other possible options.

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to Metabolomics

  • Short history of metabolomics and metabolomes
  • Relationship between metabolomics and other “omics”
  • Principles of NMR, chromatography, and mass spectrometry
  • Targeted vs. non-targeted metabolomics

Module 2: Metabolite Identification and Annotation

  • Spectral deconvolution and its application to NMR, GC-MS, and LC-MS data
  • Introduction to software tools: Bayesil, AMDIS, GC-AutoFit, and XCMS
  • Introduction to MS databases and database searches: PubChem, ChEBI, CFM-ID, and NIST

Lab Practical: Compound ID and Quantification:

  • Perform metabolite ID and/or quantification using:
    • NMR data and Bayesil
    • GC-MS data and GC-AutoFit
    • LC-MS/MS data and XCMS Online
  • Explore results with Human Metabolome Database

Module 3: Databases for Chemical, Spectral, and Biological Data

  • Explore different database models and different kinds of metabolomic databases
  • Introduction to public spectral databases, pathway databases, and comprehensive metabolomic databases
  • Optional Exercises:
    • Identify and annotate metabolites using databases
    • Explore software tools and databases

Module 4: Backgrounder in Statistical Methods

  • Distributions and significance
  • Introduction to univariate (t-tests and ANOVA) and multivariate (PCA and PLS-DA) statistics
  • Correlation and clustering

Module 5: MetaboAnalyst

  • Standard metabolomics data analysis workflow
  • Introduction to MetaboAnalyst and its modules
    • Metabolomic data processing
    • Data reduction and statistical analysis
    • Metabolite Set Enrichment Analysis (MSEA)
    • Pathway analysis
    • Biomarker analysis

Lab Practical: Metabolomic Data Analysis using MetaboAnalyst 3.0:

  • Use MetaboAnalyst to analyze:
    • NMR-based metabolomic data
    • GC-MS-based metabolomic data
    • LC-MS/MS-based metabolomic data

Module 6: Future of Metabolomics (David Wishart)

  • Current and future challenges for metabolomics
  • Current and future challenges for “omics” studies