Course Description

This workshop is intended to provide an introduction to machine learning and its application to bioinformatics. This workshop is not intended for machine learning experts. Instead, it targets biologists or other life scientists who want to understand what machine learning is, what it can do and how it can be used for a variety of bioinformatic or medical informatics applications.

Course Objectives

Students will gain experience in:

  • Applications and Limitations of Machine Learning and Deep Learning
  • Decision Trees and Random Forests – how they work, how they are coded in Python and R, and how they can be used in bioinformatic applications (biomarker discovery and modeling)
  • Artificial Neural Networks (ANNs) – how they work, how data is encoded, how they are coded in Python and R, and how they can be used in bioinformatic applications (classification and secondary structure prediction)
  • Hidden Markov Models (HMMs) – how they work, how they are coded in Python and R and how they can be used in bioinformatics applications (gene finding)
  • Using Machine Learning tools (Decision Trees, ANNs and HMMs) on the Web (scikit-learn and Keras/Colab)

Target Audience

Graduates, postgraduates, staff bioinformaticians and PIs working with or about to embark on using machine learning for bioinformatics applications.

Prerequisites

Participants should be familiar with the basics of Python and/or R.

You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, macOS or Linux (most computers purchased in the past 3-4 years likely meet these requirements).

This workshop requires participants to complete pre-workshop tasks and readings.

Course Outline

Module 1: Introduction to Machine Learning

  • What it is, what it isn’t. What it can be used for, what it shouldn’t be used for.
  • Examples of machine learning in bioinformatics
  • Brief introduction to machine learning methods including:
    • 1) artificial neural networks
    • 2) hidden Markov models
    • 3) decision trees and random forests
    • 4) deep neural networks

Module 2: Decision Trees and Random Forests

  • Details of how a simple Decision Tree (DT) works and how a Random Forest (RF) works. An example of using DTs in classification. 
  • Coding (Python and R) a simple DT to do classification 
  • Example of RF in regression 
  • Assessing model performance

Module 3: Artificial Neural Networks

  • Details of how a simple ANN works 
  • The meaning of hidden layers 
  • An example ANN simulation
  • Coding (Python and R) a simple ANN to do classification 
  • Assessing model performance

Module 4: Artificial Neural Networks for Secondary Structure Prediction

  • Coding (Python and R) a simple ANN to do secondary structure prediction 
  • Assessing model performance
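
As a rough illustration of how sequence data can be encoded for an ANN, the sketch below one-hot encodes a sliding window of residues around each position of a protein sequence. This is only one common encoding and is not taken from the workshop materials; the function name `one_hot_windows`, the window size of 5 and the example sequence are assumptions made for illustration.

```python
import numpy as np

# The 20 standard amino acids; anything else is encoded as all zeros.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_windows(sequence, window=5):
    """Return one feature row per residue (hypothetical helper).

    Each row concatenates the one-hot vectors of the residues in a
    window centred on that position. Positions that fall off either
    end of the sequence are left as zeros. `window` should be odd.
    """
    half = window // 2
    n_symbols = len(AMINO_ACIDS)
    features = np.zeros((len(sequence), window * n_symbols), dtype=np.float32)
    for center in range(len(sequence)):
        for offset in range(-half, half + 1):
            pos = center + offset
            if 0 <= pos < len(sequence) and sequence[pos] in AA_INDEX:
                slot = offset + half  # index of this residue within the window
                features[center, slot * n_symbols + AA_INDEX[sequence[pos]]] = 1.0
    return features


# Made-up example sequence, used only to show the output shape.
encoded = one_hot_windows("MKTAYIAK", window=5)
print(encoded.shape)  # (8, 100): 8 residues, 5 window slots x 20 amino acids
```

Each row could then be paired with a per-residue label (for example helix, strand or coil) to train a classifier.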

Module 5: Gene Prediction with ANNs

  • Review the structure of prokaryotic genes
  • Explain how prokaryotic gene prediction or identification can be done and how gene prediction is assessed
  • Review and explain the Python code for different ANNs to predict prokaryotic genes
  • Use Colab to explore gene prediction code

Modules 6/7: Machine Learning with Keras and Scikit-learn

  • Introduce scikit-learn and Keras and compare them to NumPy
  • Show how the Iris classification problem with decision trees can be simplified using scikit-learn (sklearn)
  • Show how the ANN Iris classifier can be coded using Keras and sklearn
  • Use Colab to explore different code sets
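
To give a flavour of what the scikit-learn version of the Iris decision tree problem might look like, here is a minimal sketch. It is not the workshop's own code; the train/test split, the `max_depth=3` setting and the random seeds are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the Iris data set bundled with scikit-learn (150 flowers, 4 features).
iris = load_iris()

# Hold out 30% of the samples for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# Fit a shallow decision tree; the depth limit is an illustrative choice.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Assess performance on the held-out samples.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Held-out accuracy: {accuracy:.2f}")
```

The same fit/predict pattern applies to `RandomForestClassifier` from `sklearn.ensemble`, which makes it easy to swap models in a Colab notebook.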

Module 8: Information Extraction with ChatGPT

  • Introduction to large language models (LLMs)
  • Introduction to ChatGPT
  • Introduction to information extraction
  • Using ChatGPT for coding
  • Getting the ChatGPT API
  • Lab: Examples of query engineering and information extraction with ChatGPT

Workshop Details:

Duration: 2 days

Start: Aug 16, 2023

End: Aug 17, 2023

Location:
Course Mode:

Status: Registration Closed

Workshop Ended

Offers:
CAD $285 for applications received between February 1, 2023 and July 16, 2023
CAD $385 for applications received between July 17, 2023 and August 2, 2023
Limited to: 40 participants
Lead Instructors:
Open Access Content:

Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.

