
Python is a powerful programming language widely used by bioinformaticians. Many of its libraries, such as Biopython, Pandas and Matplotlib, are useful for bioinformatics analysis. This workshop covers the use of these specialized libraries to process and visualize real-world bioinformatics data through hands-on activities. Participants will learn to design and implement solutions for analyzing bioinformatics datasets, enabling them to apply these skills to their own research projects.
After the course, participants should be able to:
- Use the specialized Python libraries NumPy, Pandas, Matplotlib, Biopython and Scikit-bio
- Apply these skills to processing and analyzing bioinformatics datasets
- Understand and use programming best practices
- Design and implement bioinformatics solutions using the Python programming language and the specialized libraries covered in this workshop
This workshop is intended for graduates, postgraduates, and PIs who are planning to design and execute strategies for data analysis and who already have some familiarity with the Python programming language. It builds on the foundation from Introduction to Python; total beginners are encouraged to take both workshops together.
Participants need basic familiarity with Python, including syntax, variables, and data types. These concepts will be covered in the Introduction to Python workshop that precedes this workshop.
You will also require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5GHz CPU, 2GB RAM, 10GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3-4 years likely meet these requirements).
This workshop requires participants to complete pre-workshop tasks and readings.
Module 1: Knowledge check/programming basics
- Built-in function calls, importing modules and calling methods
- Control statements: Conditions and loops
- Function definition
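As a rough illustration of the Module 1 topics, the sketch below combines a function definition, a condition, a loop, a method call, and an imported module. The function name `gc_content` and the example sequences are hypothetical and are not taken from the workshop materials.

```python
import math  # importing a module from the standard library


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()  # calling a string method
    if len(sequence) == 0:  # condition guarding against empty input
        return 0.0
    gc_count = 0
    for base in sequence:  # loop over each character
        if base in "GC":
            gc_count += 1
    return gc_count / len(sequence)


# Hypothetical example sequences, made up for illustration.
sequences = ["ATGCGC", "ATATAT", ""]
gc_values = [gc_content(seq) for seq in sequences]

# Built-in max() combined with a function from the imported module.
highest_gc_percent = math.ceil(max(gc_values) * 100)
```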
Module 2: NumPy, Pandas
- Intro to Matplotlib plotting
- NumPy fundamentals
- Pandas data structures: Series and DataFrames
- Pandas operations
Lab practical: Applying Pandas to explore and manipulate a biomedical dataset
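The Module 2 topics might look something like the following minimal sketch, which assumes a small made-up expression table rather than the dataset used in the lab. The gene names, column names, and values are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical expression table; names and values are made up for illustration.
data = pd.DataFrame(
    {
        "gene": ["geneA", "geneB", "geneC", "geneD"],
        "control": [10.0, 200.0, 35.0, 0.0],
        "treated": [40.0, 100.0, 35.0, 8.0],
    }
)

# A Series is a single labelled column; a DataFrame is a table of Series.
control = data["control"]

# NumPy functions apply element-wise to whole columns.
# The pseudocount of 1 avoids dividing by zero or taking log2(0).
data["log2_fold_change"] = np.log2((data["treated"] + 1) / (data["control"] + 1))

# Boolean filtering followed by sorting, two common Pandas operations.
upregulated = data[data["log2_fold_change"] > 1].sort_values(
    "log2_fold_change", ascending=False
)
```

Here `upregulated` keeps only the rows whose treated value is more than double the control value after the pseudocount is added.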
Module 3: Matplotlib and Seaborn
- Simple plotting with Matplotlib and Seaborn
- Plots of two variables
- Pair plots
- Plotting subplots
Lab practical: Create plots for visual bioinformatics data analysis using Matplotlib and Seaborn
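A minimal sketch of simple plotting and subplots is shown below. It uses Matplotlib only, on synthetic values generated for illustration, so the variable names and the data are assumptions rather than workshop content. Seaborn offers higher-level equivalents that are not shown here.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic values generated for illustration only.
rng = np.random.default_rng(seed=0)
gene_length = rng.uniform(500, 5000, size=50)
expression = 0.002 * gene_length + rng.normal(0, 1, size=50)

# One figure holding two side-by-side subplots.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Distribution of a single variable.
ax_hist.hist(expression, bins=10)
ax_hist.set_xlabel("Expression (arbitrary units)")
ax_hist.set_ylabel("Number of genes")

# A plot of two variables.
ax_scatter.scatter(gene_length, expression)
ax_scatter.set_xlabel("Gene length (bp)")
ax_scatter.set_ylabel("Expression (arbitrary units)")

fig.tight_layout()
```

Calling `fig.savefig` with a file name of your choice would write the figure to disk.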
Lab Project
- Practice use of Pandas, Matplotlib and Seaborn for data manipulation and visualization
- Use Pandas to load a biomedical dataset and manipulate the data to prepare it for the creation of visualizations/plots that support analyses and decision making
- Trainees will familiarize themselves with Pandas and the visualization libraries so that they can apply these tools in their own projects.
Module 4: Biopython
- Sequences in Biopython
- Annotations, locations and features
- Bio.SeqIO
Lab practical: Handling sequence data with Biopython
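To give a flavour of the Module 4 topics, the sketch below builds a `Seq` object and reads FASTA records with `Bio.SeqIO`. It assumes Biopython is installed. The FASTA text is an in-memory, made-up example, not a workshop file.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq

# A Seq object behaves like a string with biology-aware methods.
dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
protein = dna.translate(to_stop=True)  # translate up to the first stop codon
reverse_complement = dna.reverse_complement()

# Hypothetical FASTA content held in memory instead of a file on disk.
fasta_text = (
    ">seq1 hypothetical record\n"
    "ATGGCC\n"
    ">seq2 hypothetical record\n"
    "ATGAAATAG\n"
)

# SeqIO.parse yields one SeqRecord per FASTA entry.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))
lengths = {record.id: len(record.seq) for record in records}
```

For a real file, `SeqIO.parse` accepts a path in place of the `StringIO` handle.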
Module 5: Scikit-bio
- Basics for bioinformatics with Scikit-bio
- Embeddings and Vectors
- Working with multiple omic data types
- Prediction with Scikit-bio
Lab practical: Prediction example using Scikit-bio
Lab Project
- Using Biopython and Scikit-bio to implement a bioinformatics workflow
- Trainees will load and manipulate a biomedical dataset (DNA sequences) to create a simple prediction task from the input data
- Trainees will reinforce and check their understanding of the libraries covered in the workshop
Duration: 2 days
Start: Nov 05, 2025
End: Nov 06, 2025
Status: Application Open
Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.