Performing a genome scan with binary traits


Teaching: 20 min
Exercises: 10 min
Questions

  • How do I create a genome scan for binary traits?

Objectives

  • Convert phenotypes to binary values.

  • Use logistic regression for genome scans with binary traits.

  • Plot and compare genome scans for binary traits.

The genome scans above were performed assuming that the residual variation followed a normal distribution. This will often provide reasonable results even if the residuals are not normal, but an important special case is that of a binary trait, with values 0 and 1, which is best treated differently. The scan1 function can perform a genome scan with binary traits by logistic regression, using the argument model="binary". (The default value for the model argument is "normal".) At present, we cannot account for relationships among individuals in this analysis.
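To make the idea of a logistic-regression LOD score concrete, here is a minimal sketch in base R for a single marker. It is only an analogy: scan1 works with genotype probabilities along the genome rather than observed genotypes, and the simulated data, the effect size, and the names geno, y, fit_null and fit_alt are all assumptions made for illustration.

```r
# Simulated data (hypothetical; not the iron dataset)
set.seed(1)
n <- 200
geno <- sample(0:2, n, replace=TRUE)   # marker genotype coded 0/1/2
prob <- plogis(-1 + 1.0 * geno)        # P(trait = 1) increases with genotype
y <- rbinom(n, size=1, prob=prob)      # binary trait with values 0 and 1

# Null model: no genotype effect; alternative: one effect per genotype
fit_null <- glm(y ~ 1, family=binomial)
fit_alt <- glm(y ~ factor(geno), family=binomial)

# LOD score = log10 likelihood ratio of the alternative vs the null model
lod <- as.numeric(logLik(fit_alt) - logLik(fit_null)) / log(10)
lod
```

A genome scan repeats this kind of comparison at each position, which is why the output of a binary-trait scan is still a LOD curve.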

Let’s first turn our two phenotypes into binary traits by thresholding at the median. One would generally not do this in practice; this is just for illustration.

bin_pheno <- apply(iron$pheno, 2, function(a) as.numeric(a > median(a)))
rownames(bin_pheno) <- rownames(iron$pheno)
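The thresholding step can be checked on a small toy matrix. The sketch below uses made-up values (toy_pheno is hypothetical, not the iron data) and adds na.rm=TRUE to median, an assumption that only matters if a phenotype column contains missing values; without it, a single NA would make the whole column NA.

```r
# Toy phenotype matrix (hypothetical values)
toy_pheno <- cbind(liver  = c(1.2, 3.4, 2.2, NA, 5.1),
                   spleen = c(0.5, 0.1, 0.9, 0.7, 0.3))
rownames(toy_pheno) <- paste0("ind", 1:5)

# 1 if above the column median, 0 otherwise; NA stays NA
toy_bin <- apply(toy_pheno, 2,
                 function(a) as.numeric(a > median(a, na.rm=TRUE)))
rownames(toy_bin) <- rownames(toy_pheno)   # apply() drops the row names
toy_bin

# Count of 0s, 1s and NAs in each column
apply(toy_bin, 2, function(a) table(a, useNA="ifany"))
```

Note that an individual whose value equals the median is coded 0, so with an odd number of individuals the two groups are not the same size.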

We now perform the genome scan as before, including model="binary" to indicate that the phenotypes are binary traits with values 0 and 1.

out_bin <- scan1(pr, bin_pheno, Xcovar=Xcovar, model="binary")

Here is a plot of the two LOD curves.

par(mar=c(5.1, 4.1, 1.1, 1.1))
ymx <- maxlod(out_bin)
plot(out_bin, map, lodcolumn=1, col="slateblue", ylim=c(0, ymx*1.02))
plot(out_bin, map, lodcolumn=2, col="violetred", add=TRUE)
legend("topleft", lwd=2, col=c("slateblue", "violetred"), colnames(out_bin), bg="gray90")

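We can use find_peaks as before. Here, threshold=3.5 reports only peaks with a LOD score above 3.5, and drop=1.5 requests a 1.5-LOD support interval for each peak.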

find_peaks(out_bin, map, threshold=3.5, drop=1.5)
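To illustrate what the threshold and drop arguments mean, here is a simplified sketch in base R on a toy LOD curve for one chromosome. The positions and LOD scores are invented, and the endpoint handling is an assumption; find_peaks may define the interval ends somewhat differently.

```r
# Toy LOD curve (hypothetical positions in cM and LOD scores)
pos <- c(0, 5, 10, 15, 20, 25, 30)
lod <- c(0.8, 2.1, 3.9, 5.2, 4.1, 2.6, 1.0)

threshold <- 3.5
drop <- 1.5

peak <- which.max(lod)                  # index of the highest LOD score
passes <- lod[peak] > threshold         # is the peak above the threshold?

# Walk outward from the peak while the LOD stays within `drop` of the maximum
lo <- peak
while (lo > 1 && lod[lo - 1] > lod[peak] - drop) lo <- lo - 1
hi <- peak
while (hi < length(lod) && lod[hi + 1] > lod[peak] - drop) hi <- hi + 1

data.frame(pos=pos[peak], lod=lod[peak], ci_lo=pos[lo], ci_hi=pos[hi],
           passes_threshold=passes)
```

In this toy curve the peak is at position 15 with LOD 5.2, and the positions with LOD above 5.2 - 1.5 = 3.7 run from 10 to 20.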

Key Points

  • A genome scan for binary traits (0 and 1) requires special handling; scans for non-binary traits assume normal variation of the residuals.

  • A genome scan for binary traits is performed with logistic regression.