Linkage disequilibrium of single-nucleotide polymorphism data: how sampling methods affect estimates of linkage disequilibrium

单核苷酸多态性数据的连锁不平衡:抽样方法如何影响连锁不平衡的估计

阅读:2

Abstract

Linkage disequilibrium (LD) is an important measure used in the analysis of single-nucleotide polymorphism (SNP) data. We used the Genetic Analysis Workshop 16 (GAW16) Framingham Heart Study 500 k SNP data to explore the effect of sampling methods on estimating of LD for SNP data. METHOD AND DATA: We found 332 trios in the GAW16 Framingham SNP data. Repeated random samples without replacement, of different sizes of trios and independent individuals, are drawn from these 332 trios. For each sample, the LD is calculated using the Haploview program for the chromosome 1 SNP data. Percents of D' > 0.8 and r2 > 0.8 are calculated for different distance bins based on the Haploview output. The results are summarized by sample size and sampling methods to give us an overall view of the effect of sample size and sampling methods on the LD estimation. RESULTS: Trios design gave stable estimates. A sample of 30 to 40 trios gave estimates of percent of LD > 0.8 very close to those from 332 trios. When independent individuals are used, the estimates are less stable and are different from those obtained from the 332 trios for both D' and r2, with larger differences for D'. CONCLUSION: Our results suggest that trio design gives a stable estimate of LD. Therefore it may be more suitable for LD analysis than using independent individuals. We must be cautious when comparing the LD estimates from trios, and those from independent individuals.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。