R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Install the latest version of R from CRAN.
Install the latest version of RStudio. Choose the free RStudio Desktop version for Windows, Mac, or Linux.
Start RStudio. The qtl2 package contains code for haplotype reconstruction, QTL mapping, and plotting. Install qtl2 by copying and pasting the following code into the R console.
install.packages("qtl2")
Make sure that the installation was successful by loading the qtl2 library, either by copy-pasting to the Console or by checking the box next to qtl2 in the RStudio Packages tab. You shouldn’t get any error messages.
library(qtl2)
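If you would rather check the installation without attaching the package, a minimal sketch along the following lines may help. The helper name is_installed is hypothetical and not part of qtl2; it only wraps base R's requireNamespace().

```r
# Hypothetical helper (not part of qtl2): TRUE if the package can be loaded.
is_installed <- function(pkg) {
  isTRUE(requireNamespace(pkg, quietly = TRUE))
}

if (is_installed("qtl2")) {
  # packageVersion() reports the version that R found in your library
  message("qtl2 version ", as.character(packageVersion("qtl2")), " is installed")
} else {
  message("qtl2 was not found; try running install.packages(\"qtl2\") again")
}
```

Either message is informational only; nothing is installed or attached by this check.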
1. Create a new project called mapping.
   - Click the File menu button, then New Project.
   - Click New Directory.
   - Click New Project.
   - Type mapping as the directory name. Browse to your Desktop to create the project there.
   - Click the Create Project button.
2. Use the Files tab to create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results. Alternatively, you can use the R console to run the following commands for step 2 only. You still need to create a project with step 1.
dir.create("./data")
dir.create("./scripts")
dir.create("./results")
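Calling dir.create() on a folder that already exists produces a warning. If you expect to rerun the setup, a sketch like the one below creates only the folders that are missing. The function name create_project_dirs and its arguments are assumptions made for illustration.

```r
# Hypothetical helper: create the lesson folders under `root` if they are missing.
create_project_dirs <- function(root = ".",
                                folders = c("data", "scripts", "results")) {
  paths <- file.path(root, folders)
  for (p in paths) {
    if (!dir.exists(p)) {
      dir.create(p, recursive = TRUE)
    }
  }
  # report which folders now exist
  out <- dir.exists(paths)
  names(out) <- folders
  out
}

# run from inside the mapping project so that "." is the project folder
create_project_dirs()
```

The returned named logical vector should show TRUE for each of data, scripts, and results.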
Download the data files and move them into the data folder. You can download the files from the URLs below and move the files the same way that you would for downloading and moving any other kind of data. Alternatively, you can copy and paste the following into the R console to download the data.
options(timeout=900) # raise the download timeout from the default 60 seconds to 900 seconds to help with large file downloads
# these four commands download files from a URL and place them in the data directory you created
download.file(url="https://ndownloader.figshare.com/files/18533342", destfile="./data/cc_variants.sqlite")
download.file(url="https://ndownloader.figshare.com/files/24607961", destfile="./data/mouse_genes.sqlite")
download.file(url="https://ndownloader.figshare.com/files/24607970", destfile="./data/mouse_genes_mgi.sqlite")
download.file(url="ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile="./data/qtl2_demo.Rdata")
options(timeout=60) # reset the download timeout to the default 60 seconds
# on a Windows machine, add the argument mode = "wb" to each download.file() command
# for example
download.file(url="ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile="./data/qtl2_demo.Rdata", mode = "wb")
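Because these files are large, you may not want to download them again if a copy is already in place. The following is a minimal sketch, assuming a hypothetical helper named download_if_missing, that skips files that already exist and always passes mode = "wb", which Windows needs for binary files and which should do no harm on Mac or Linux.

```r
# Hypothetical helper: download `url` to `destfile` unless a non-empty
# copy of the file is already there.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile) && file.size(destfile) > 0) {
    message("Skipping ", destfile, " (already present)")
    return(invisible(FALSE))
  }
  # mode = "wb" writes the file in binary mode
  download.file(url = url, destfile = destfile, mode = "wb")
  invisible(TRUE)
}

# Example call, using one of the URLs from the lesson:
# download_if_missing("https://ndownloader.figshare.com/files/24607961",
#                     "./data/mouse_genes.sqlite")
```

The helper returns FALSE when it skips a file and TRUE after a download. You would still raise the timeout with options(timeout=900) before calling it on the larger files.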
You will need these for the final lesson episodes on SNP association mapping and QTL analysis in Diversity Outbred mice.
Make sure that both the SNP and gene files downloaded correctly by running the following code. If you get an error, use getwd() to print the current working directory and check the file path (e.g. "~/Desktop/mapping/data/cc_variants.sqlite") carefully, or download the files again. If needed, use setwd() to change the working directory to the location where you saved the files.
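Before querying the files, it can help to confirm that R can see them at all. The sketch below assumes the working directory is the mapping project folder and that the files were saved under the names used in the download commands. The helper name check_files is hypothetical.

```r
# Hypothetical helper: report whether each file exists and its size in MB.
check_files <- function(paths) {
  data.frame(
    file = basename(paths),
    exists = file.exists(paths),
    size_mb = round(file.size(paths) / 1e6, 1),  # NA when a file is missing
    stringsAsFactors = FALSE
  )
}

data_files <- file.path("data", c("cc_variants.sqlite",
                                  "mouse_genes.sqlite",
                                  "mouse_genes_mgi.sqlite",
                                  "qtl2_demo.Rdata"))

print(getwd())                  # relative paths are resolved against this folder
print(check_files(data_files))
```

A FALSE in the exists column, or a size near zero, suggests a wrong working directory or an incomplete download.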
Check part of the SNP file. It is a very large file, so checking only a sample of the file should do.
# create a function to query the SNP file, then use this new function
# to select SNPs on chromosome 1 from 10 to 11 Mbp
snp_func = create_variant_query_func(dbfile = "~/Desktop/mapping/data/cc_variants.sqlite")
snps = snp_func(chr = 1, start = 10, end = 11)
# check the dimensions of this sample of the SNP file
dim(snps)
You should get a result that is 13150 rows by 16 columns.
Check the gene file in the same way.
# create a function to query the gene file, then select genes in the same region as above
gene_func = create_gene_query_func(dbfile = "~/Desktop/mapping/data/mouse_genes_mgi.sqlite")
genes = gene_func(chr = 1, start = 10, end = 11)
dim(genes) # check the dimensions
You should get a result that is 18 rows by 15 columns.
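Rather than comparing the printed dimensions by eye, you could turn the expected values into an explicit check. This is only a sketch; check_dims is a hypothetical helper, and the expected values are the ones stated in this lesson.

```r
# Hypothetical helper: stop with a readable message if dimensions differ.
check_dims <- function(x, expected) {
  actual <- dim(x)
  if (!identical(as.integer(actual), as.integer(expected))) {
    stop("expected ", paste(expected, collapse = " x "),
         " but got ", paste(actual, collapse = " x "))
  }
  invisible(TRUE)
}

# With the snps and genes objects created above, the calls would be:
# check_dims(snps, c(13150, 16))
# check_dims(genes, c(18, 15))
```

If a check fails, the error message shows both the expected and the actual dimensions, which may point to a truncated download.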