Summary and Setup
Analyzing the association between gene expression and genetic variants is known as expression quantitative trait locus (eQTL) mapping. eQTL mapping searches for associations between the expression of one or more genes and a genetic locus. Specifically, genetic variants underlying an eQTL peak explain some of the variation in gene expression levels. eQTL studies can reveal the architecture of quantitative traits, connect DNA sequence variation to phenotypic variation, and shed light on transcriptional regulation and regulatory variation. Traditional analytic techniques like linkage and association mapping can be applied to thousands of gene expression traits (transcripts) in eQTL analysis, so gene expression can be mapped in much the same way as a physiological phenotype like blood pressure or heart rate. Joining gene expression and physiological phenotypes with genetic variation can identify genes with variants affecting disease phenotypes.
Software Setup
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Install the latest version of R from CRAN.
Install the latest version of RStudio. Choose the free RStudio Desktop version for Windows, Mac, or Linux.
Start RStudio.
Install R and Bioconductor packages.
R
install.packages(c("tidyverse", "ggbeeswarm", "knitr", "qtl2", "remotes"))
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("AnnotationHub", "DESeq2", "qvalue", "rtracklayer", "sva"))
remotes::install_github("churchill-lab/intermediate")
Once the installation is complete, load the libraries to make sure that they installed correctly.
R
library(tidyverse)
library(ggbeeswarm)
library(knitr)
library(intermediate)
library(qtl2)
library(AnnotationHub)
library(DESeq2)
library(qvalue)
library(sva)
library(rtracklayer)
If the libraries don’t load or you received errors during the installation, please contact the workshop instructors for help before the workshop.
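If it is unclear which package failed, a quick check can help narrow it down before you contact the instructors. The following is a minimal sketch rather than part of the official setup: it assumes the package list matches the install commands above, and the variable names pkgs, installed, and missing_pkgs are illustrative.

```r
# Packages this lesson expects (copied from the install commands above).
pkgs <- c("tidyverse", "ggbeeswarm", "knitr", "intermediate", "qtl2",
          "AnnotationHub", "DESeq2", "qvalue", "sva", "rtracklayer")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise,
# without attaching it to the search path.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report any packages that could not be found.
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) == 0) {
  message("All packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Including the list of missing packages in your message to the instructors should make the problem easier to diagnose.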
Project organization
1. Create a new project on your Desktop called eqtl_mapping.
   - Click the File menu button, then New Project.
   - Click New Directory.
   - Click New Project.
   - Type eqtl_mapping as the directory name. Browse to your Desktop to create the project there.
   - Click the Create Project button.
2. Use the Files tab to create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results. Alternatively, you can use the R console to run the following commands for step 2 only. You still need to create a project with step 1.
R
dir.create("./data")
dir.create("./scripts")
dir.create("./results")
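Note that dir.create() issues a warning if a folder already exists. As a minimal sketch, assuming your working directory is the eqtl_mapping project, the following creates only the folders that are missing and then confirms that all three are present (the variable name folders is illustrative):

```r
# Folders the lesson expects inside the project directory.
folders <- c("data", "scripts", "results")

# Create each folder only if it does not exist yet, which avoids the warning.
for (f in folders) {
  if (!dir.exists(f)) {
    dir.create(f)
  }
}

# One TRUE/FALSE value per folder; all should be TRUE.
dir.exists(folders)
```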
Data Sets
For this course, you will need to download several data files to the data directory in the project folder on your Desktop.
- Download files from the GitHub lesson repository. You will need to download them one by one using the direct links below. For each file, locate the download button at the upper right.
Repeat this process for each file. Then move the files from wherever your downloads go (e.g. Downloads) to the data directory in the eqtl_mapping project. You can use a graphical user interface (e.g. Windows File Explorer, Mac Finder) to move the files.
Copy, paste, and run the following code in the RStudio console to download the genotype probabilities for the gene expression study we will explore in this lesson.
R
download.file(url = "https://thejacksonlaboratory.box.com/shared/static/4hy4hbjyrxjbrzh570i4g02r62bx3lgk.rds",
              destfile = "data/attie_DO500_genoprobs_v5.rds",
              mode = "wb")
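Large downloads can fail if they exceed R's download timeout, which defaults to 60 seconds. The sketch below raises the timeout for the current session and checks whether the file arrived. It is an illustration under stated assumptions, not part of the official instructions: the 600-second value is an arbitrary choice, and the variable names probs_file and size_mb are illustrative. Set the timeout before running download.file(), then run the check afterwards.

```r
# Allow up to 10 minutes per download in this session (arbitrary value;
# keeps any larger timeout that is already set).
options(timeout = max(600, getOption("timeout")))

# Path used as destfile in the download command above.
probs_file <- "data/attie_DO500_genoprobs_v5.rds"

# Confirm that the file exists and report its size on disk.
if (file.exists(probs_file)) {
  size_mb <- file.size(probs_file) / 1024^2
  message(sprintf("Found %s (%.1f MB).", probs_file, size_mb))
} else {
  message("File not found: ", probs_file,
          ". Re-run the download command above.")
}
```

If the reported size looks far too small for a set of genotype probabilities, the download was probably interrupted and is worth repeating.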