This workshop is intended to provide an introduction to machine learning and its application to bioinformatics. This workshop is not intended for machine learning experts. Instead, it targets biologists or other life scientists who want to understand what machine learning is, what it can do and how it can be used for a variety of bioinformatic or medical informatics applications.
Students will gain experience in:
- Applications and Limitations of Machine Learning and Deep Learning
- Decision Trees and Random Forests – how they work, how they are coded in Python and R, and how they can be used in bioinformatic applications (biomarker discovery and modeling)
- Artificial Neural Networks (ANNs) – how they work, how data is encoded, how they are coded in Python and R, and how they can be used in bioinformatic applications (classification and secondary structure prediction)
- Hidden Markov Models (HMMs) – how they work, how they are coded in Python and R and how they can be used in bioinformatics applications (gene finding)
- Using Machine Learning tools (Decision Trees, ANNs and HMMs) on the Web (scikit-learn and Keras/Colab)
This workshop is aimed at graduates, postgraduates, staff bioinformaticians and PIs who are working with, or about to embark on using, machine learning for bioinformatics applications.
Prerequisites: familiarity with Linux or Unix operating systems, and with Python and/or R.
You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact email@example.com for more information.
This workshop requires participants to complete pre-workshop tasks and readings.
- Introduction to machine learning – what it is, what it isn’t. What it can be used for, what it shouldn’t be used for
- Examples of machine learning in bioinformatics.
- Brief introduction to machine learning methods including:
  1) artificial neural networks;
  2) hidden Markov models;
  3) decision trees and random forests;
  4) deep neural networks.
- Decision Trees and Random Forests – Details of how a simple Decision Tree (DT) works and how a Random Forest (RF) works. An example of using DTs in classification. Coding (Python and R) a simple DT to do classification. Example of RF in regression. Assessing model performance
- Artificial Neural Networks – Details of how a simple ANN works. The meaning of hidden layers. An ANN simulation. Coding (Python and R) a simple ANN to do classification. Assessing model performance
- More Artificial Neural Networks – Details of how an ANN works. The meaning of hidden layers. An ANN simulation. Coding (Python and R) a simple ANN to do secondary structure prediction. Assessing model performance
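The ANN modules above mention how data is encoded for secondary structure prediction. As a rough illustration only (not the workshop's own code), the sketch below shows one common approach: a sliding window of residues, each turned into a one-hot vector. The function names, the window size of 5 and the example sequence are all assumptions made for this example.

```python
# Illustrative sketch only: one-hot encoding of a sliding window of residues,
# a common way to present protein sequence to a simple ANN.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Return a 20-element 0/1 vector; padding or unknown residues give all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def encode_windows(sequence, window=5):
    """Encode each position as the concatenated one-hot vectors of its window.

    The window size of 5 is an arbitrary choice for this example.
    Positions past either end of the sequence are padded with '-'.
    """
    half = window // 2
    padded = "-" * half + sequence + "-" * half
    encoded = []
    for i in range(len(sequence)):
        features = []
        for residue in padded[i:i + window]:
            features.extend(one_hot(residue))
        encoded.append(features)
    return encoded


# Hypothetical 8-residue peptide, used only to show the shape of the output.
vectors = encode_windows("MKTAYIAK", window=5)
print(len(vectors), len(vectors[0]))  # one 100-element vector per residue
```

Each vector would be one input row for the network, with the secondary structure state of the central residue as the target label.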
Modules 5 and 6:
- Hidden Markov Models – Details of an HMM and how it works. An HMM Simulation. Coding (Python and R) a simple HMM to do gene prediction in prokaryotes. Assessing model performance
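To give a feel for what "how an HMM works" can mean in practice, here is a minimal sketch of Viterbi decoding for a two-state model. It is not taken from the workshop material. The state names and every probability are invented for illustration, and a real prokaryotic gene finder would model codon structure, start and stop signals, and trained parameters.

```python
import math

# All probabilities below are made-up illustrative values, not trained parameters.
STATES = ("noncoding", "coding")
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
    "coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
}


def viterbi(sequence):
    """Return the most probable state path for a DNA sequence (log-space Viterbi)."""
    if not sequence:
        return []
    # scores[t][s]: best log-probability of any path ending in state s at position t
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][sequence[0]]) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at position t
    for base in sequence[1:]:
        prev = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            current[s] = (prev[best_prev] + math.log(TRANS[best_prev][s])
                          + math.log(EMIT[s][base]))
            pointers[s] = best_prev
        scores.append(current)
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical sequence with a GC-rich stretch in the middle.
path = viterbi("ATATATGCGCGCGCGCATATAT")
print(path)
```

Working in log space avoids numerical underflow on long sequences, which matters once the input is a genome rather than a toy string.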
Modules 7 and 8:
- Machine Learning on the Web – Introduction to scikit-learn and Keras.
- Illustrating how the “complex” models coded in Modules 2-6 can be built easily, with very little code, using scikit-learn and Colab
- Assessing model performance
- Work on your own and try different modules if interested
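As an indication of how little code a library-based model can need, the following sketch trains and scores a random forest with scikit-learn. It is an illustration under stated assumptions, not the workshop's notebook: the data are synthetic (generated by `make_classification` as a stand-in for a biomarker table) and the parameter values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a biomarker table: 200 samples x 20 features.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

# Hold out 25% of the samples so performance is assessed on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

After fitting, `model.feature_importances_` gives one score per feature, which is a typical starting point for biomarker ranking.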
Duration: 2 days
Start: May 25, 2021
End: May 26, 2021
Course Mode: Online
Status: Registration Closed
Open Access Content:
Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.