Abstract
The analysis of systems involving many loci is important in population and quantitative genetics. A central problem is the study of linkage disequilibrium (LD), a concept relevant to genome-enabled prediction of quantitative traits and to the exploration of marker-phenotype associations. This article introduces a new estimator of an LD parameter (ρ²) that is much easier to compute than a maximum likelihood (or Bayesian) estimate of a tetrachoric correlation. We examined the conjecture that the sampling distribution of the estimator of ρ² could be less frequency dependent than that of the estimator of r², a widely used metric for assessing LD. This was done via an empirical evaluation of LD in 806 Holstein-Friesian cattle using 771 single-nucleotide polymorphism (SNP) markers, and of HapMap III data on 21,991 SNPs (chromosome 3) observed in 88 unrelated individuals from Tuscany. In addition, 1600 haplotypes over a region of 1 Mb, simulated under the coalescent, were used to estimate LD with the two measures. Subsequently, a simulation study compared the new estimator with that of r² under several scenarios of LD and allelic frequencies. From these studies, it is concluded that ρ² provides a useful metric for the study of LD, as the distribution of its estimator is less frequency dependent than that of the standard estimator of r².
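For orientation, the frequency dependence of r² can be illustrated with a minimal sketch that computes the standard estimate from two-locus haplotype counts, r² = D² / [p_A(1 − p_A) p_B(1 − p_B)], with D = p_AB − p_A p_B. The function name `r_squared` and its count arguments are hypothetical and chosen only for illustration. The estimator of ρ² is not defined in this abstract, so it is not sketched here.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Standard r^2 between two biallelic loci from haplotype counts.

    Arguments are counts of the four haplotypes AB, Ab, aB, ab
    (illustrative names, not taken from the article).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_AB = n_AB / n              # frequency of haplotype AB
    p_A = (n_AB + n_Ab) / n      # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n      # frequency of allele B at locus 2

    D = p_AB - p_A * p_B         # disequilibrium coefficient
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return float("nan")      # undefined when either locus is monomorphic
    return D * D / denom


# Equal allele frequencies, complete association: r^2 reaches 1.
print(r_squared(50, 0, 0, 50))

# Haplotype Ab is absent (D is as large as these frequencies allow),
# yet r^2 is only about 0.11 because p_A = 0.1 and p_B = 0.5 differ.
print(r_squared(10, 0, 40, 50))
```

The second call shows why r² is described as frequency dependent: its attainable maximum is set by the allele frequencies at the two loci, so low values need not indicate weak association.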