Performing a genome scan with binary traits
Last updated on 2024-10-21
Estimated time: 50 minutes
Overview
Questions
- “How do I create a genome scan for binary traits?”
Objectives
- Convert phenotypes to binary values.
- Use logistic regression for genome scans with binary traits.
- Plot and compare genome scans for binary traits.
The genome scans above were performed assuming that the residual
variation followed a normal distribution. This will often provide
reasonable results even if the residuals are not normal, but an
important special case is that of a binary trait, with values 0 and 1,
which is best treated differently. The scan1 function can perform a
genome scan with binary traits by logistic regression, using the
argument model="binary". (The default value for the model argument is
"normal".) At present, we cannot account for relationships among
individuals in this analysis.
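To make the idea of a logistic-regression scan concrete, the following is a minimal sketch of a LOD score at a single marker using base R. It is not the qtl2 implementation: the genotypes and phenotypes are simulated, the names geno and y are hypothetical, and hard genotype calls are used where scan1 works with genotype probabilities. The LOD score is taken as the log10 likelihood ratio between a model with the marker and a null model without it.

```r
# Simulated data for illustration only (hypothetical, not from the cross).
set.seed(1)
n <- 200
geno <- sample(0:2, n, replace = TRUE)    # hypothetical genotype codes at one marker
prob <- plogis(-1 + 0.8 * geno)           # probability that the trait equals 1
y <- rbinom(n, size = 1, prob = prob)     # binary phenotype with values 0 and 1

# Null model: no genotype effect. Alternative: one effect per genotype class.
fit_null <- glm(y ~ 1, family = binomial())
fit_alt  <- glm(y ~ factor(geno), family = binomial())

# LOD = difference in log-likelihoods, converted from natural log to log10.
lod <- (as.numeric(logLik(fit_alt)) - as.numeric(logLik(fit_null))) / log(10)
lod
```

A genome scan repeats this comparison at each position along the genome, which produces one LOD score per position.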
Let’s look at the phenotypes in the cross again.
R
head(cross$pheno)
OUTPUT
          log10_insulin_10wk agouti_tan tufted
Mouse3051              1.399          1      0
Mouse3551              0.369          1      1
Mouse3430              0.860          0      1
Mouse3476              0.800          1      0
Mouse3414              1.370          0      0
Mouse3145              1.783          1      0
There are two binary traits, “agouti_tan” and “tufted”, which are related to coat color and shape.
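The traits in this cross are already coded as 0 and 1. If a trait were stored as labels instead, it would need to be recoded first. The following is a minimal sketch, assuming a hypothetical character vector called coat that is not part of the cross object:

```r
# Hypothetical coat color labels; NA marks a mouse with a missing phenotype.
coat <- c("black", "agouti", "agouti", NA, "black")

# Code agouti as 1 and black as 0. Missing values stay NA.
coat_binary <- as.integer(coat == "agouti")
coat_binary

# Check that every non-missing value is 0 or 1 before mapping.
all(coat_binary %in% c(0, 1, NA))
```

Which level is coded as 1 is a choice made here for illustration. It matches the coding used in this lesson, where black is 0 and agouti is 1.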
We perform a binary genome scan in a similar manner to mapping
continuous traits by using scan1. When we mapped insulin, there was an
argument called model that we did not set explicitly, which told qtl2
which mapping model to use. There are two options: normal, the default,
and binary. The normal option tells qtl2 to use a "normal" (least
squares) linear model. To map a binary trait, we will include
model = "binary" to indicate that the phenotype is a binary trait with
values 0 and 1.
R
lod_agouti <- scan1(genoprobs = probs,
                    pheno     = cross$pheno[,'agouti_tan'],
                    addcovar  = addcovar,
                    model     = "binary")
Let’s plot the result and see if there is a peak.
R
plot_scan1(x    = lod_agouti,
           map  = cross$pmap,
           main = 'Agouti')
Yes! There is a big peak on chromosome 2. Let’s zoom in on chromosome 2.
R
plot_scan1(x    = lod_agouti,
           map  = cross$pmap,
           chr  = 2,
           main = 'Agouti')
We can use find_peaks to find the position of the highest LOD score.
R
find_peaks(scan1_output = lod_agouti,
           map          = cross$pmap)
This turns out to be a well-known coat color locus that contains the nonagouti gene. Mice carrying two black alleles will have a black coat, and mice carrying one or no black alleles will have agouti coats.
Challenge 1: How many mice have black coats?
Look at the frequency of the black (0) and agouti (1) phenotypes.
What proportion of the mice are black? Can you use what you learned
about how the nonagouti locus works and the cross design to explain the
frequency of black mice?
First, get the number of black and agouti mice.
R
tbl <- table(cross$pheno[,"agouti_tan"])
tbl
OUTPUT
0 1
125 356
Then use the number of mice to calculate the proportion with each coat color.
R
tbl / sum(tbl)
OUTPUT
0 1
0.26 0.74
We can see that the black (0) mice occur about 25% of the time. If A is
the black allele, which is recessive, and a is the agouti allele, then
mice need two copies of A to have black coats. When breeding two
heterozygous (Aa) mice together, we expect genotype frequencies of:
Genotype | Frequency | Coat Color |
---|---|---|
AA | 0.25 | black |
Aa | 0.5 | agouti |
aa | 0.25 | agouti |
From this, we can see that about 25% of the mice should have black coats.
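One way to check whether the observed counts agree with this expectation is a chi-square goodness-of-fit test. The sketch below uses base R and the counts from the table above (125 black, 356 agouti). The 1:3 expected ratio follows from the genotype table, and the names observed, expected and fit are introduced here for illustration.

```r
# Observed counts of black (0) and agouti (1) mice from the table above.
observed <- c(black = 125, agouti = 356)

# Expected proportions from an Aa x Aa cross: 1/4 black, 3/4 agouti.
expected <- c(black = 0.25, agouti = 0.75)

# Goodness-of-fit test of the observed counts against the 1:3 ratio.
fit <- chisq.test(x = observed, p = expected)
fit$statistic
fit$p.value
```

A p-value above 0.05 means the observed counts are consistent with the expected 1:3 ratio of black to agouti mice.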
Challenge 2: Map the “tufted” phenotype.
Map the tufted phenotype and determine if there are any tall peaks for this trait.
First, map the trait.
R
lod_tufted <- scan1(genoprobs = probs,
                    pheno     = cross$pheno[,"tufted"],
                    addcovar  = addcovar,
                    model     = "binary")
Then, plot the LOD score.
R
plot_scan1(x    = lod_tufted,
           map  = cross$pmap,
           main = "Tufted")
There is a large peak on chromosome 17. This is a known locus associated with the Itpr3 gene near 27.3 Mb on chromosome 17.
Key Points
- “A genome scan for binary traits (0 and 1) requires special handling; scans for non-binary traits assume normal variation of the residuals.”
- “A genome scan for binary traits is performed with logistic regression.”