Summary and Setup

Analyzing the association between gene expression and genetic variants is known as expression quantitative trait locus (eQTL) mapping. eQTL mapping searches for associations between the expression of one or more genes and a genetic locus. Specifically, genetic variants underlying an eQTL peak explain some of the variation in gene expression levels. eQTL studies can reveal the architecture of quantitative traits, connect DNA sequence variation to phenotypic variation, and shed light on transcriptional regulation and regulatory variation. Traditional analytic techniques like linkage and association mapping can be applied to thousands of gene expression traits (transcripts) in eQTL analysis, so gene expression can be mapped in much the same way as a physiological phenotype like blood pressure or heart rate. Joining gene expression and physiological phenotypes with genetic variation can identify genes with variants affecting disease phenotypes.

Software Setup


R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

  1. Install the latest version of R from CRAN.

  2. Install the latest version of RStudio. Choose the free RStudio Desktop version for Windows, Mac, or Linux.

  3. Start RStudio.

  4. Install R and Bioconductor packages.

R

install.packages(c("tidyverse", "ggbeeswarm", "knitr", "qtl2", "remotes"))

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install(c("AnnotationHub", "DESeq2", "qvalue",  "rtracklayer", "sva"))

remotes::install_github("churchill-lab/intermediate")

Once the installation is complete, load the libraries to make sure that they installed correctly.

R

library(tidyverse)
library(ggbeeswarm)
library(knitr)
library(intermediate)
library(qtl2)
library(AnnotationHub)
library(DESeq2)
library(qvalue)
library(sva)
library(rtracklayer)

If the libraries don’t load or you received errors during the installation, please contact the workshop instructors before the workshop for help.
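If something fails to load, it can help to know exactly which packages are missing before you ask for help. The following is a minimal sketch, not part of the original lesson, that checks each package named in the install commands above. The variable names `pkgs`, `installed`, and `missing_pkgs` are our own.

```r
# Packages the lesson asks you to install (taken from the commands above)
pkgs <- c("tidyverse", "ggbeeswarm", "knitr", "intermediate", "qtl2",
          "AnnotationHub", "DESeq2", "qvalue", "sva", "rtracklayer")

# requireNamespace() returns TRUE or FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that could not be found
missing_pkgs <- names(installed)[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```

If any package names are reported, rerun the matching install command and include the error message when you contact the instructors.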

Project organization


  1. Create a new project on your Desktop called eqtl_mapping.
  • Click the File menu button, then New Project.
  • Click New Directory.
  • Click New Project.
  • Type eqtl_mapping as the directory name. Browse to your Desktop to create the project there.
  • Click the Create Project button.
  2. Use the Files tab to create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results. Alternatively, you can run the following commands in the R console to complete step 2. You still need to create the project in step 1 first.

R

dir.create("./data")
dir.create("./scripts")
dir.create("./results")
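Note that `dir.create()` issues a warning if a folder already exists. As a minimal sketch, assuming your working directory is the eqtl_mapping project folder, the loop below creates only the folders that are missing and then confirms that all three are present. The variable name `folders` is our own.

```r
# Folder names from step 2 above
folders <- c("data", "scripts", "results")

for (f in folders) {
  # Create the folder only if it is not already there
  if (!dir.exists(f)) {
    dir.create(f)
  }
}

# Returns one TRUE or FALSE per folder
dir.exists(folders)
```

You can run this more than once without producing warnings.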

Data Sets


This course uses several data files, which you will need to download to the data directory in the project folder on your Desktop.

  1. Download files from the GitHub lesson repository. You will need to download them one by one using the direct links below. For each file, locate the download button at the upper right of the page.
[Figure: the download button at the right of the GitHub data file page. Caption: Select the download button.]

Repeat this process for each file. Then move the files from wherever your downloads go (e.g. Downloads) to the data directory in the eqtl_mapping project. You can use a graphical user interface (e.g. Windows File Explorer, Mac Finder) to move the files.

  1. physiological phenotypes

  2. phenotype dictionary

  3. covariates

  4. gene annotations

  5. raw gene expression

  6. map

  7. kinship

  8. eQTL peaks

  9. insulin permutations

  10. Hnf1b permutations

  11. chromosome 11 insulin blups

  12. chromosome 11 Hnf1b blups

  2. Copy, paste, and run the following code in the RStudio console to download the genotype probabilities for the gene expression study we will explore in this lesson.

R

download.file(url      = "https://thejacksonlaboratory.box.com/shared/static/4hy4hbjyrxjbrzh570i4g02r62bx3lgk.rds",
              destfile = "data/attie_DO500_genoprobs_v5.rds",
              mode     = "wb")
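R stops a download after 60 seconds by default (the `timeout` option), which may be too short for a large file on a slow connection. The sketch below is a suggestion rather than part of the original lesson. It raises the limit to 10 minutes, an arbitrary value chosen for illustration, and then checks whether the genotype probabilities file is in the data folder. The variable name `genoprobs_file` is our own. If the download above was cut off, run the `options()` line first and then repeat the download.

```r
# Allow up to 10 minutes per download; keep a larger value if one is already set
options(timeout = max(600, getOption("timeout")))

# Destination path used in the download.file() call above
genoprobs_file <- "data/attie_DO500_genoprobs_v5.rds"

if (file.exists(genoprobs_file)) {
  # file.size() reports bytes; convert to megabytes for readability
  size_mb <- round(file.size(genoprobs_file) / 1024^2, 1)
  message("Found ", genoprobs_file, " (", size_mb, " MB)")
} else {
  message(genoprobs_file, " not found; rerun download.file() above.")
}

# List everything currently in the data folder
list.files("data")
```

The listing at the end should also show the files you downloaded from the lesson repository in step 1.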