Gene sampling strategies for multi-locus population estimates of genetic diversity (theta)

Abstract

BACKGROUND: Theoretical work suggests that data from multiple nuclear loci provide better estimates of population genetic parameters than do single loci, but how many loci are needed, and how much sequence is required from each, has been little explored.

METHODOLOGY/PRINCIPAL FINDINGS: To investigate how much data is required to estimate the population genetic parameter theta (θ = 4Nₑμ, where Nₑ is the effective population size and μ is the per-site mutation rate) accurately under ideal circumstances, we simulated datasets of DNA sequences under three values of theta per site (0.1, 0.01, 0.001), varying both the total number of base pairs sequenced per individual and the number of equal-length loci. From these datasets we estimated theta using the maximum likelihood coalescent framework implemented in the computer program Migrate. Our results corroborated the theoretical expectation that increasing the number of loci improves the accuracy of the estimate more than increasing the sequence length at single loci. However, when the value of theta was low (0.001), the per-locus sequence length was also important for estimating theta accurately, something that has not been emphasized in previous work.

CONCLUSIONS/SIGNIFICANCE: Accurate estimation of theta required data from at least 25 independently evolving loci. Beyond this number, sampling more loci reduced the squared coefficient of variation of the coalescent estimates only slightly relative to the extra effort required.
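The trade-off between number of loci and per-locus length can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it swaps the maximum likelihood estimation in Migrate for Watterson's estimator (based on the number of segregating sites), and it assumes an infinite-sites model, no recombination within loci, and free recombination between loci. The function names, the sample size of 10 sequences, the fixed total of 10,000 bp, and the number of replicates are all hypothetical choices made for illustration.

```python
import random
import statistics


def tree_length(rng, n_seqs):
    """Total branch length of a coalescent tree for n_seqs sequences.

    Time is in units of 2N generations, so the waiting time while k
    lineages remain is exponential with rate k(k-1)/2.
    """
    return sum(k * rng.expovariate(k * (k - 1) / 2) for k in range(2, n_seqs + 1))


def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting rate-1 arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def watterson_theta(rng, theta, n_loci, bp_per_locus, n_seqs):
    """Simulate independent loci and return Watterson's per-site theta estimate."""
    a_n = sum(1 / i for i in range(1, n_seqs))
    seg_sites = 0
    for _ in range(n_loci):
        # Mutations fall on the tree at rate (theta * length_in_bp) / 2 per unit time.
        lam = theta * bp_per_locus / 2 * tree_length(rng, n_seqs)
        seg_sites += poisson(rng, lam)
    return seg_sites / (a_n * n_loci * bp_per_locus)


def squared_cv(values):
    """Squared coefficient of variation: variance divided by the squared mean."""
    mean = statistics.fmean(values)
    return statistics.pvariance(values) / mean ** 2


def compare_designs(theta=0.01, total_bp=10_000, n_seqs=10, reps=200, seed=1):
    """Split a fixed sequencing budget across 1, 5, or 25 equal-length loci.

    Returns {n_loci: (mean estimate, squared CV of the estimates)}.
    All default values are illustrative assumptions, not taken from the study.
    """
    rng = random.Random(seed)
    results = {}
    for n_loci in (1, 5, 25):
        estimates = [
            watterson_theta(rng, theta, n_loci, total_bp // n_loci, n_seqs)
            for _ in range(reps)
        ]
        results[n_loci] = (statistics.fmean(estimates), squared_cv(estimates))
    return results


if __name__ == "__main__":
    for n_loci, (mean, cv2) in compare_designs().items():
        print(f"{n_loci:>2} loci: mean theta = {mean:.4f}, squared CV = {cv2:.4f}")
```

Under these assumptions the squared coefficient of variation should fall as the same number of base pairs is spread over more independent loci, because each additional locus contributes an independent genealogy, whereas extra sites at one locus share a single genealogy. The sketch does not reproduce the study's numerical results or its 25-locus threshold.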
