Oct 12, 2018
9:00 am - 4:30 pm
Instructors: Sue McClatchy, Asli Uyar, Dan Gatti
Helpers: Yuka Takemon, Duy Pham
This workshop is open to those who have met the prerequisite by taking a 2-day R workshop or otherwise being competent in R. The workshop is open to those at the Jackson Laboratory and neighboring institutions.
Where: Breezeway Bioinformatics Training Room, Bldg 1, Room 1540, 600 Main Street, Bar Harbor, Maine. Get directions with OpenStreetMap or Google Maps.
When: Oct 12, 2018. Add to your Google Calendar.
Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). They are also required to abide by the Code of Conduct.
Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:
Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide what you need.
Contact: Please email susan.mcclatchy@jax.org for more information.
Please be sure to complete these surveys before and after the workshop.
Before | Pre-workshop survey |
09:00 | Welcome and Introductions |
09:15 | Introduction to Bioconductor |
10:15 | RNA-seq data analysis with DESeq2 |
10:45 | Coffee |
11:00 | RNA-seq data analysis with DESeq2 (continued) |
12:30 | Lunch break |
13:30 | Solving common bioinformatic challenges using GenomicRanges |
14:45 | Coffee |
15:00 | Introduction to Bioconductor annotation |
15:45 | Public data resources and Bioconductor |
16:45 | Wrap-up |
END | Post-workshop survey |
We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.
To participate in a workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.
We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful too.
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio. If you already have R and RStudio installed on your machine, please upgrade to the latest versions of each.
Windows: Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on the .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.
macOS: Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE.
Linux: You can download the binary files for your distribution from CRAN. Alternatively, you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo dnf install R). Also, please install the RStudio IDE.
Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development. The current release of Bioconductor is version 3.7; it works with R version 3.5.0. Users of older versions of R and Bioconductor must update their installations to take advantage of new features and to access packages that have been added to Bioconductor since the last release.
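A quick way to confirm that your R installation meets the version mentioned above is sketched here. The helper name r_is_recent_enough and the variable minimum_r are made up for this illustration; only getRversion() and as.numeric_version() are standard R.

```r
# Minimum R version taken from the text above
# (Bioconductor 3.7 works with R 3.5.0).
minimum_r <- "3.5.0"

# Hypothetical helper: TRUE when `current` is at least `minimum`.
# Versions are compared component by component, so "3.10.0"
# counts as newer than "3.5.0".
r_is_recent_enough <- function(current = getRversion(), minimum = minimum_r) {
  as.numeric_version(current) >= as.numeric_version(minimum)
}

if (r_is_recent_enough()) {
  message("R ", as.character(getRversion()), " is recent enough for this workshop.")
} else {
  message("R ", as.character(getRversion()), " is older than ", minimum_r,
          "; please upgrade before the workshop.")
}
```

Comparing version objects rather than plain strings avoids the trap where "3.10.0" would sort before "3.5.0" alphabetically.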
Packages available in Bioconductor are summarized at https://bioconductor.org/packages.
The widget on the left summarizes four distinct types of Bioconductor packages:
1) software, 2) annotation, 3) experiment data, and 4) workflow.
Like CRAN (R) packages, Bioconductor packages need to be installed only once per R installation, and then attached to each session where they are going to be used. Bioconductor packages are installed slightly differently from CRAN packages. The first step is to install the BiocManager package from CRAN.
Open RStudio, then copy and paste the following code into the console:
if (!"BiocManager" %in% rownames(installed.packages()))
    install.packages("BiocManager", repos="https://cran.r-project.org")
The next step is to install the desired Bioconductor packages.
The syntax to install the packages is
BiocManager::install(c("rtracklayer", "GenomicRanges", "SummarizedExperiment", "DESeq2", "tximportData", "airway", "apeglm", "AnnotationHub", "ReportingTools", "Glimma", "splatter"))
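If some of these packages are already on your machine, you may prefer to install only the ones that are missing. The following is a minimal sketch under that assumption; missing_packages is a hypothetical helper written for this page, not part of BiocManager, and the install step is only attempted in an interactive session such as the RStudio console.

```r
# Packages named in the install command above.
workshop_pkgs <- c("rtracklayer", "GenomicRanges", "SummarizedExperiment",
                   "DESeq2", "tximportData", "airway", "apeglm",
                   "AnnotationHub", "ReportingTools", "Glimma", "splatter")

# Hypothetical helper: return the members of `pkgs` that are not
# in the vector of installed package names.
missing_packages <- function(pkgs,
                             installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(workshop_pkgs)

if (length(to_install) == 0) {
  message("All workshop packages are already installed.")
} else {
  message("Still to install: ", paste(to_install, collapse = ", "))
  # Only install when run interactively and BiocManager is available,
  # so that sourcing this script non-interactively has no side effects.
  if (interactive() && requireNamespace("BiocManager", quietly = TRUE)) {
    BiocManager::install(to_install)
  }
}
```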
A convenient function in BiocManager is available(), which accepts a regular expression to find matching packages. The following finds all TxDb packages (describing exon, transcript, and gene coordinates).
BiocManager::available("TxDb")
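To see how a regular expression narrows such a search without needing a network connection, the sketch below applies grep() to a small, made-up vector of package names. It assumes that available() filters names in roughly the same way, ignoring case; the vector example_pkgs is illustrative only and is not the real list of Bioconductor packages.

```r
# Illustrative sample of package names (not the full Bioconductor list).
example_pkgs <- c("TxDb.Hsapiens.UCSC.hg38.knownGene",
                  "TxDb.Mmusculus.UCSC.mm10.knownGene",
                  "TxDb.Hsapiens.UCSC.hg19.knownGene",
                  "DESeq2",
                  "GenomicRanges")

# A plain pattern keeps every name containing "TxDb".
# ignore.case = TRUE is an assumption about how available() matches.
all_txdb <- grep("TxDb", example_pkgs, value = TRUE, ignore.case = TRUE)

# A more specific pattern keeps only the hg38 and mm10 builds.
# "\\." matches a literal dot; "(hg38|mm10)" matches either build name.
hg38_or_mm10 <- grep("^TxDb\\..*\\.(hg38|mm10)\\.", example_pkgs,
                     value = TRUE, ignore.case = TRUE)

print(all_txdb)
print(hg38_or_mm10)
```

The same kind of pattern could be passed to BiocManager::available() to shorten a long list of matches.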
Use the BiocManager::install() function shown above to install the UCSC knownGene TxDb packages for human hg38 and mouse mm10.
BiocManager::install(c("TxDb.Hsapiens.UCSC.hg38.knownGene", "TxDb.Mmusculus.UCSC.mm10.knownGene"))
Bioconductor packages tend to depend on one another quite a lot, so it is important that the correct versions of all packages are installed.
Validate your installation with
BiocManager::valid()
In addition to the Bioconductor packages named above, we'll use some of the R packages from tidyverse. Run the following code in the console, or install packages from the RStudio Packages tab.
install.packages("tidyverse")
Create a new folder on your Desktop called bioconductor. Inside it, create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results.
setwd("~/Desktop")
dir.create("./bioconductor")
setwd("~/Desktop/bioconductor")
dir.create("./data")
dir.create("./scripts")
dir.create("./results")
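The commands above assume that a Desktop folder exists in your home directory and will warn if a folder is already there. A variant that can be re-run safely might look like the sketch below; create_workshop_folders is a hypothetical helper, and the demonstration uses a temporary directory so that nothing is written to your Desktop until you choose to.

```r
# Hypothetical helper: create data, scripts, and results folders under `root`.
# recursive = TRUE also creates `root` itself if needed, and
# showWarnings = FALSE keeps re-runs quiet when folders already exist.
create_workshop_folders <- function(root = file.path("~", "Desktop", "bioconductor")) {
  subfolders <- file.path(root, c("data", "scripts", "results"))
  for (folder in subfolders) {
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }
  # Return (invisibly) whether each subfolder now exists.
  invisible(dir.exists(subfolders))
}

# Demonstration in a temporary directory.
demo_root <- file.path(tempdir(), "bioconductor")
created <- create_workshop_folders(demo_root)
print(list.files(demo_root))
```

Calling create_workshop_folders() with no arguments would use the Desktop location described above.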
Please download the following large files before the workshop and place them in your data folder. You can download them from the URLs below and move them as you would any other file.
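If you would rather fetch the files from within R, one possible approach is sketched here. fetch_if_missing is a hypothetical helper, and the address in the commented-out call is a placeholder, not one of the workshop URLs; substitute the real addresses yourself.

```r
# Hypothetical helper: download `url` into `dest_dir` unless a file with
# the same name is already there. Returns TRUE if a download was
# attempted and FALSE if it was skipped.
fetch_if_missing <- function(url, dest_dir = "data") {
  dest <- file.path(dest_dir, basename(url))
  if (file.exists(dest)) {
    message("Already present, skipping: ", dest)
    return(invisible(FALSE))
  }
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" keeps binary files intact, which matters on Windows.
  download.file(url, destfile = dest, mode = "wb")
  invisible(TRUE)
}

# Demonstration of the skip branch, using a temporary folder and an
# empty stand-in file so that no network access is needed.
demo_data <- file.path(tempdir(), "demo_data")
dir.create(demo_data, showWarnings = FALSE)
file.create(file.path(demo_data, "large_file.rds"))
was_downloaded <- fetch_if_missing("https://example.org/large_file.rds", demo_data)

# Real use (placeholder URL, replace with a workshop URL):
# fetch_if_missing("https://example.org/large_file.rds", dest_dir = "data")
```

Skipping files that already exist makes it cheap to re-run the script if a download was interrupted partway through the list.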