R is one of the most important scripting languages for both experimental and computational biologists. It is well-designed, efficient, and widely adopted, and it has a very large base of contributors who add new functionality for all modern aspects of data analysis and visualization. Moreover, it is free and open source. However, R's great power and expressiveness can at first be difficult to approach without guidance, especially for those who are new to programming. This workshop introduces the essential ideas and tools of R. Although it covers running statistical tests in R, it does not cover statistical concepts.
Participants will gain practical experience and skills to be able to:
- Meet the challenges of data handling
- Break down problems into structured parts
- Use R syntax, functions and packages
This workshop is intended for graduates, postgraduates, and PIs who design and execute strategies for data analysis but have little or no familiarity with the R statistical workbench. It is designed to lead on to the two-day workshop on Exploratory Data Analysis, which follows it.
You will require your own laptop computer. Minimum requirements: 1024×768 screen resolution, 1.5 GHz CPU, 2 GB RAM, 10 GB free disk space, and a recent version of Windows, Mac OS X or Linux (most computers purchased in the past 3 to 4 years likely meet these requirements). If you do not have access to your own computer, you may borrow one from the CBW. Please contact support@bioinformatics.ca for more information.
This workshop requires participants to complete pre-workshop tasks and readings.
Module 1: Getting to Know R
- The environment and the user interface
- How to get help and where to find information
- Syntax and language principles
- Data types: numbers, time and factors, strings and text
- Data classes: vectors, matrices, lists, data frames and hashes
- Reading data into the R environment
- Data format considerations
- Accessing your data once it’s in R
- Manipulating data in R
- Subsetting (slicing, filtering and reshaping)
- Base R approaches vs plyr and dplyr
- The Bioconductor project and other useful packages
Module 2: Exploring your data in R
- Creating base R plots
- Scatter plot, line plot, histogram, boxplot, bar plot
- Tweaking your plots
- Changing colors, sizes, labels, legends, and more
- Multi-panel plots and changing R plotting dimensions
- A quick look at some specialized plotting packages
- Manhattan plots, Circos plots, volcano plots, lollipop plots, genome browser plots
Module 3: Analyzing your data in R
- How to run different types of models in R
- R data-type considerations
- Different types of models
- Packages and functions for more complex models
- Accessing and using model output in R
- Exploring model output
- Exporting model output
- Plotting model output
- Using model output to perform model diagnostics
Module 4: Optimizing your code (organization and time)
- Functions: why and how?
- Loops and vectorized operations
- Debugging review
Duration: 2 days
Start: Jun 21, 2021
End: Jun 22, 2021
Status: Registration Closed
Workshop Ended
Canadian Bioinformatics Workshops promotes open access. Past workshop content is available under a Creative Commons License.