Course Description

This workshop is intended to provide an introduction to machine learning and its application to bioinformatics. This workshop is not intended for machine learning experts. Instead, it targets biologists or other life scientists who want to understand what machine learning is, what it can do and how it can be used for a variety of bioinformatic or medical informatics applications.

Course Objectives

Students will gain experience in:

  • Applications and Limitations of Machine Learning and Deep Learning
  • Decision Trees and Random Forests – how they work, how they are coded in Python and R, and how they can be used in bioinformatic applications (biomarker discovery and modeling)
  • Artificial Neural Networks (ANNs) – how they work, how data is encoded, how they are coded in Python and R, and how they can be used in bioinformatic applications (classification and secondary structure prediction)
  • Large Language Models (LLMs) – how they work, and how they can be used in bioinformatics applications (text mining, information extraction)
  • Using Machine Learning tools (Decision Trees, ANNs and HMMs) on the Web (SciKit Learn and Keras/Colab)

Target Audience

Graduates, postgraduates, staff bioinformaticians and PIs working with or about to embark on using machine learning for bioinformatics applications.


Participants should be familiar with the basics of Python and/or R.

You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact the CBW for more information.

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to Machine Learning

  • What it is, what it isn’t. What it can be used for, what it shouldn’t be used for.
  • Examples of machine learning in bioinformatics
  • Brief introduction to machine learning methods including:
    • 1) artificial neural networks
    • 2) hidden Markov models
    • 3) decision trees and random forests
    • 4) deep neural networks

Module 2: Decision Trees and Random Forests

  • Details of how a simple Decision Tree (DT) works and how a Random Forest (RF) works. An example of using DTs in classification. 
  • Coding (Python and R) a simple DT to do classification 
  • Example of RF in regression 
  • Assessing model performance
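
As a rough illustration of how a single Decision Tree node chooses a split, the sketch below scores candidate thresholds on one feature by weighted Gini impurity. It is not taken from the workshop material: the function names, the use of Gini impurity as the split criterion, and the biomarker values are all assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    If no split improves on the unsplit data, the threshold is None.
    """
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up biomarker concentrations and class labels, for illustration only.
concentration = [1.2, 1.9, 2.4, 3.1, 5.8, 6.3, 7.0, 7.7]
status = ["healthy"] * 4 + ["disease"] * 4
threshold, impurity = best_split(concentration, status)
print(f"split at {threshold:.2f}, weighted Gini {impurity:.2f}")
```

On this toy data the two classes separate cleanly, so the chosen threshold falls midway between 3.1 and 5.8 and the weighted impurity drops to zero. A full tree would repeat the same search recursively on each side of the split, and a Random Forest would average many such trees grown on resampled data.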

Module 3: Artificial Neural Networks

  • Details of how a simple ANN works 
  • The meaning of hidden layers 
  • An example ANN simulation
  • Coding (Python and R) a simple ANN to do classification 
  • Assessing model performance
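
One way to see what a hidden layer contributes is a tiny network that computes XOR, a function no single neuron can represent. The sketch below is only a forward pass with hand-picked weights (nothing is trained), and the weights, layer sizes and function names are assumptions chosen for illustration rather than anything specified by the workshop.

```python
import math


def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid, per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hand-picked (not learned) weights, chosen for illustration only.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]  # neuron 1 acts like OR, neuron 2 like NAND
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]  # output neuron acts like AND of the two hidden neurons
OUTPUT_B = [-30.0]


def predict(x1, x2):
    """Forward pass through a 2-2-1 network; returns a value between 0 and 1."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


predictions = [round(predict(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(predictions)  # [0, 1, 1, 0]
```

The hidden neurons re-express the inputs as "at least one is on" and "not both are on", which makes the problem linearly separable for the output neuron. In practice these weights would be learned by backpropagation rather than set by hand.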

Module 4: Artificial Neural Networks for Secondary Structure Prediction

  • Coding (Python and R) a simple ANN to do secondary structure prediction 
  • Assessing model performance
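
Before an ANN can predict secondary structure, the protein sequence has to be turned into numbers. A common approach, sketched here under assumptions, is to one-hot encode a sliding window of residues centred on the position being predicted. The window size of 7, the extra padding slot, the example sequence and the function names are illustrative choices, not details from the workshop.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """21-element vector: 20 amino acids plus one slot for padding or unknowns."""
    vector = [0] * (len(AMINO_ACIDS) + 1)
    index = AMINO_ACIDS.find(residue)
    vector[index if index >= 0 else len(AMINO_ACIDS)] = 1
    return vector


def encode_windows(sequence, window=7):
    """Return one flattened input vector per residue, centred on that residue.

    The window size is assumed to be odd so that each window has a clear centre.
    """
    half = window // 2
    padded = "-" * half + sequence + "-" * half  # pad both ends of the sequence
    encoded = []
    for i in range(len(sequence)):
        features = []
        for residue in padded[i:i + window]:
            features.extend(one_hot(residue))
        encoded.append(features)
    return encoded


# Arbitrary example peptide, for illustration only.
inputs = encode_windows("MKTAYIAKQR", window=7)
print(len(inputs), len(inputs[0]))  # 10 147
```

Each residue yields a vector of 7 × 21 = 147 values, which would feed the input layer of the network. The matching training target would be the secondary structure class (for example helix, strand or coil) of the central residue.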

Module 5: Machine Learning for Gene Prediction

  • Coding (Python and R) a simple ANN and HMM to do gene prediction in prokaryotes 
  • Assessing model performance
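
To give a feel for how an HMM labels a genome, the sketch below runs the Viterbi algorithm on a two-state model that distinguishes GC-rich "coding" stretches from AT-rich "noncoding" ones. This is a deliberately simplified toy: real prokaryotic gene finders model codon structure, start and stop signals, and both strands. Every probability and name here is invented for illustration.

```python
import math

STATES = ("coding", "noncoding")
# All probabilities below are invented for illustration, not estimated from data.
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def viterbi(sequence):
    """Most probable state path for a DNA string, computed in log space."""
    if not sequence:
        return []
    scores = {s: math.log(START[s]) + math.log(EMIT[s][sequence[0]]) for s in STATES}
    backpointers = []
    for base in sequence[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best previous state to arrive from, given we are now in state s.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][base])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


path = viterbi("ATATATGCGCGCGC")
print("".join("C" if s == "coding" else "N" for s in path))  # NNNNNNCCCCCCCC
```

With these parameters the model labels the AT-rich prefix as noncoding and switches to coding where the GC-rich run begins. Working in log space avoids numerical underflow on long sequences.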

Module 6: Machine Learning on the Web

  • Introduction to SciKit Learn and Keras
  • Illustrating how the “complex” models coded in Modules 2-5 can be built easily, and with very little code, using SciKit Learn and Colab
  • Assessing model performance
  • Work on your own and try different modules

Module 7: Large Language Models and Bioinformatics

  • Introduction to ChatGPT and LLMs
  • Example applications of LLMs in information extraction and text mining
  • Setting up and using LLMs for bioinformatics applications

Workshop Details:

Duration: 2 days

Start: Sep 07, 2024

End: Sep 08, 2024

Location: Virtual

Status: Application Open

CAD $495 for applications received between February 7, 2024 and August 7, 2024
CAD $695 for applications received between August 8, 2024 and August 24, 2024
Limited to: 40 participants

Open Access Content:

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.

