R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Install the latest version of R from CRAN.
Install the latest version of RStudio. Choose the free RStudio Desktop version for Windows, Mac, or Linux.
Start RStudio. The qtl2 package contains code for haplotype reconstruction, QTL mapping and plotting. Install qtl2 by copying and pasting the following code in the R console.
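One way to run the installation, sketched under the assumption that qtl2 is taken from CRAN. The helper `install_if_missing()` and the mirror URL are illustrative choices, not part of qtl2 or this lesson; the helper skips the install when the package is already present.

```r
# Hypothetical helper (not part of qtl2): install a CRAN package only when it
# is not already available in the current R library.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " is already installed")
    return(invisible(FALSE))
  }
  # Assumption: the package is available from the CRAN mirror given in `repos`
  install.packages(pkg, repos = repos)
  invisible(TRUE)
}
```

Calling `install_if_missing("qtl2")` in the R console would then install qtl2 only if it is missing.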
Make sure that the installation was successful by loading the qtl2 library, either by copying and pasting library(qtl2) into the Console or by checking the box next to qtl2 in the RStudio Packages tab. You shouldn’t get any error messages.
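To confirm the installation from the console rather than by eye, a small check such as the following may help. It is only a sketch: `requireNamespace()` and `packageVersion()` are standard R utilities, and the variable name `qtl2_ok` is arbitrary.

```r
# TRUE if qtl2 can be loaded from the current library, FALSE otherwise
qtl2_ok <- requireNamespace("qtl2", quietly = TRUE)

if (qtl2_ok) {
  library(qtl2)
  message("qtl2 loaded, version ", as.character(packageVersion("qtl2")))
} else {
  message("qtl2 was not found; try installing it again")
}
```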
1. Create a new project called mapping. Click the File menu button, then New Project, and type mapping as the directory name. Browse to your Desktop to create the project there.
2. Use the Files tab to create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results. Alternatively, you can use the R console to run the following commands for step 2 only. You still need to create a project with step 1.
dir.create("./data")
dir.create("./scripts")
dir.create("./results")
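`dir.create()` warns when a folder already exists, so re-running the commands above is noisy. A hedged alternative that creates only the missing folders is sketched here; the function name `make_project_dirs` and its `root` argument are assumptions made for illustration.

```r
# Hypothetical helper: create the project folders under `root` if they are missing.
make_project_dirs <- function(root = ".", dirs = c("data", "scripts", "results")) {
  paths <- file.path(root, dirs)
  for (p in paths) {
    if (!dir.exists(p)) {
      dir.create(p, recursive = TRUE)
    }
  }
  # Named logical vector: TRUE for every folder that exists after the call
  stats::setNames(dir.exists(paths), dirs)
}
```

Run `make_project_dirs()` from the project's root directory; it can be called repeatedly without warnings.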
Place the following data files in your data folder. You can download the files from the URLs below and move them the same way that you would download and move any other kind of data.
Alternatively, you can copy and paste the following into the R console to download the data.
# set the download timeout to 900 seconds from the default 60 seconds
# to help with large file downloads
options(timeout = 900)

# these four commands download files from a URL and place them
# in the data directory you created
download.file(url = "https://ndownloader.figshare.com/files/18533342", destfile = "./data/cc_variants.sqlite")
download.file(url = "https://ndownloader.figshare.com/files/24607961", destfile = "./data/mouse_genes.sqlite")
download.file(url = "https://ndownloader.figshare.com/files/24607970", destfile = "./data/mouse_genes_mgi.sqlite")
download.file(url = "ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile = "./data/qtl2_demo.Rdata")

# reset the download timeout to the default 60 seconds
options(timeout = 60)

# on a Windows machine, add the argument mode = "wb" to the download.file() command
# for example:
# download.file(url = "ftp://ftp.jax.org/dgatti/qtl2_workshop/qtl2_demo.Rdata", destfile = "./data/qtl2_demo.Rdata", mode = "wb")
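Because these files are large, it can be convenient to skip any that are already in place. The following is a minimal sketch, not part of the lesson: `download_if_missing` is a hypothetical wrapper around `download.file()` that always passes `mode = "wb"` (required on Windows, harmless elsewhere) and restores the previous timeout when it finishes.

```r
# Hypothetical wrapper: download `url` to `destfile` unless the file already exists.
download_if_missing <- function(url, destfile, timeout = 900) {
  if (file.exists(destfile)) {
    message("Skipping ", destfile, " (already present)")
    return(invisible(FALSE))
  }
  old <- options(timeout = timeout)  # raise the timeout for this download only
  on.exit(options(old), add = TRUE)  # restore the previous timeout on exit
  # binary mode keeps .sqlite and .Rdata files intact on Windows
  download.file(url = url, destfile = destfile, mode = "wb")
  invisible(TRUE)
}
```

For example, `download_if_missing("https://ndownloader.figshare.com/files/18533342", "./data/cc_variants.sqlite")` would fetch the SNP file only once. Note that an interrupted download can leave a partial file behind, which this simple existence check would treat as complete.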
You will need these for the final lesson episodes on SNP association mapping and QTL analysis in Diversity Outbred mice.
Make sure that both the SNP and gene files downloaded correctly by running the following code. If you get an error, use getwd() to find your current working directory and check the file path (e.g. "~/Desktop/mapping/data/cc_variants.sqlite") carefully, or download the files again. If needed, use setwd() to change the working directory to the location where you saved the files.
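Before querying the files, a quick look at whether they exist and how large they are can catch a failed download early. This sketch assumes the working directory is the project root and that the files carry the names used in the download commands; `file_check` is an arbitrary variable name.

```r
# Paths relative to the project root (assumption: getwd() is the mapping project)
data_files <- file.path("data", c("cc_variants.sqlite",
                                   "mouse_genes.sqlite",
                                   "mouse_genes_mgi.sqlite",
                                   "qtl2_demo.Rdata"))

# One row per file: does it exist, and how big is it in megabytes?
# file.size() returns NA for files that are missing.
file_check <- data.frame(
  file    = data_files,
  exists  = file.exists(data_files),
  size_mb = round(file.size(data_files) / 1024^2, 1)
)
print(file_check)
```

A file that is missing, or far smaller than expected, is a sign that the download should be repeated.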
Check part of the SNP file. It is a very large file, so checking only a sample of the file should do.
# load qtl2, which provides the query functions used below
library(qtl2)

# create a function to query the SNP file, then use this new function
# to select SNPs on chromosome 1 from 10 to 11 Mbp
snp_func = create_variant_query_func(dbfile = "~/Desktop/mapping/data/cc_variants.sqlite")
snps = snp_func(chr = 1, start = 10, end = 11)

# check the dimensions of this sample of the SNP file
dim(snps)
You should get a result that is 13150 rows by 16 columns.
Check the gene file in the same way.
# create a function to query the gene file, then select genes
# in the same region as above
gene_func = create_gene_query_func(dbfile = "~/Desktop/mapping/data/mouse_genes_mgi.sqlite")
genes = gene_func(chr = 1, start = 10, end = 11)

# check the dimensions
dim(genes)
You should get a result that is 18 rows by 15 columns.
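The two dimension checks can also be done programmatically. The helper below is a hypothetical convenience, not part of qtl2; it simply compares `dim()` of an object against the expected row and column counts quoted above.

```r
# Hypothetical helper: TRUE when `x` has exactly the expected rows and columns.
has_dims <- function(x, expected) {
  identical(as.integer(dim(x)), as.integer(expected))
}

# Assumes `snps` and `genes` were created by the queries above:
# has_dims(snps, c(13150, 16))
# has_dims(genes, c(18, 15))
```

A `FALSE` result for either call suggests that the corresponding file is incomplete or that a different version of the data was downloaded.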