Quantitative Trait Mapping: Setup


R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

  1. Install the latest version of R from CRAN.

  2. Install the latest version of RStudio from the RStudio download page. Choose the free RStudio Desktop version for Windows, Mac, or Linux.

  3. Start RStudio. The qtl2 package contains code for haplotype reconstruction, QTL mapping and plotting. Install qtl2 by copying and pasting the following code in the R console.

install.packages("qtl2", repos="http://rqtl.org/qtl2cran")

Make sure that the installation was successful by loading the qtl2 library, either by running library(qtl2) in the Console or by checking the box next to qtl2 in the RStudio Packages tab. You shouldn’t get any error messages.
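The installation check can also be scripted so that a missing package produces a readable message instead of an error. This is a minimal sketch; the helper name is_installed is hypothetical and not part of qtl2.

```r
# Hypothetical helper: TRUE if the package's namespace can be loaded,
# FALSE otherwise (no error is raised)
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (is_installed("qtl2")) {
  library(qtl2)                  # attach the package
  print(packageVersion("qtl2"))  # show which version was installed
} else {
  message("qtl2 is not installed; rerun the install.packages() command above")
}
```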


Data files and project organization

  1. Create a new project on your Desktop called mapping.
    • Click the File menu button, then New Project.
    • Click New Directory.
    • Click New Project.
    • Type mapping as the directory name. Browse to your Desktop to create the project there.
    • Click the Create Project button.
  2. Use the Files tab to create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results. Alternatively, you can create these folders from the R console with dir.create(), for example dir.create("data"). Either way, you still need to create the project in step 1 first.
  3. Please download the following large files before the workshop and place them in your data folder. You can download the files from the URLs below and move them as you would any other downloaded file.
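The three folders from step 2 can be created from the R console. This is a minimal sketch, assuming the working directory is the mapping project folder (RStudio sets this when the project is open).

```r
# Create the project folders if they do not already exist.
# Assumes the working directory is the project root (check with getwd()).
for (folder in c("data", "scripts", "results")) {
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}

# list the folders that now exist in the working directory
list.dirs(".", recursive = FALSE)
```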

Alternatively, you can copy and paste the following into the R console to download the data.

options(timeout=900) # set the download timeout to 900 seconds from the default 60 seconds to help with large file downloads
# these four commands download files from a URL and place them in the data directory you created
download.file(url="https://ndownloader.figshare.com/files/18533342", destfile="./data/cc_variants.sqlite")
download.file(url="https://ndownloader.figshare.com/files/24607961", destfile="./data/mouse_genes.sqlite")
download.file(url="https://ndownloader.figshare.com/files/24607970", destfile="./data/mouse_genes_mgi.sqlite")
download.file(url="ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile="./data/qtl2_demo.Rdata")
options(timeout=60) # reset the download timeout to the default 60 seconds

# on a Windows machine, add the argument mode = "wb" to the download.file() command
# for example
download.file(url="ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile="./data/qtl2_demo.Rdata", mode = "wb")
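Because these files are large, it can help to skip any file that is already in place when rerunning the download commands. The sketch below wraps download.file() in a hypothetical helper, download_if_missing, which is not part of the lesson or of qtl2. It always passes mode = "wb", which Windows needs for binary files and which should be harmless on Mac and Linux.

```r
# Hypothetical helper: download a file only if it is not already present.
# mode = "wb" writes the file in binary mode.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("already present, skipping: ", destfile)
  } else {
    download.file(url = url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)  # return the path without printing it
}
```

For example, download_if_missing("ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", "./data/qtl2_demo.Rdata") downloads the demo data only when it is missing from the data folder.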

You will need these for the final lesson episodes on SNP association mapping and QTL analysis in Diversity Outbred mice.
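A quick way to confirm that all four files arrived is to check that each exists and to look at its size. This is a minimal sketch, assuming the working directory is the project folder and the files keep the names used in the download commands.

```r
# paths of the four downloaded files, relative to the project folder
data_files <- file.path("data", c("cc_variants.sqlite",
                                  "mouse_genes.sqlite",
                                  "mouse_genes_mgi.sqlite",
                                  "qtl2_demo.Rdata"))

# one row per file: does it exist, and how large is it?
status <- data.frame(
  file    = data_files,
  exists  = file.exists(data_files),
  size_mb = round(file.size(data_files) / 1e6, 1)  # NA when the file is missing
)
print(status)
```

A missing file, or one that is much smaller than the others, suggests an interrupted download that should be repeated.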

Make sure that both the SNP and gene files downloaded correctly by running the following code. If you get an error, use getwd() to check your working directory, compare it carefully with the file path (e.g. "~/Desktop/mapping/data/cc_variants.sqlite"), or download the files again. If needed, use setwd() to change the working directory to the location where you saved the files.

Check part of the SNP file. It is a very large file, so checking only a sample of the file should do.

library(qtl2) # load qtl2 if it is not already loaded

# create a function to query the SNP file, then use this new function
# to select SNPs on chromosome 1 from 10 to 11 Mbp
snp_func = create_variant_query_func(dbfile = "~/Desktop/mapping/data/cc_variants.sqlite")
snps = snp_func(chr = 1, start = 10, end = 11)

# check the dimensions of this sample of the SNP file
dim(snps)

You should get a result that is 13150 rows by 16 columns.

Check the gene file in the same way.

# create a function to query the gene file, then select genes in the same region as above
gene_func = create_gene_query_func(dbfile = "~/Desktop/mapping/data/mouse_genes_mgi.sqlite") 
genes = gene_func(chr = 1, start = 10, end = 11) 
dim(genes) # check the dimensions

You should get a result that is 18 rows by 15 columns.
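The dimension checks can also be written as explicit TRUE/FALSE tests. This is a minimal sketch; has_dims is a hypothetical helper, and the toy data frame only stands in for a real query result.

```r
# Hypothetical helper: TRUE when x has exactly the expected rows and columns
has_dims <- function(x, n_rows, n_cols) {
  identical(dim(x), c(as.integer(n_rows), as.integer(n_cols)))
}

# toy data frame standing in for a query result (3 rows, 2 columns)
toy <- data.frame(a = 1:3, b = 4:6)
has_dims(toy, 3, 2)
```

After running the two queries, has_dims(snps, 13150, 16) and has_dims(genes, 18, 15) should both return TRUE if the files downloaded correctly.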