Analytical and computational solution for the estimation of SNP-heritability in biobank-scale and distributed datasets

Abstract

For a complex trait, heritability (h²) quantifies the proportion of its variation that is genetically determined. With the emergence of biobank-scale data, a more powerful method is needed to estimate h². Building on the framework of Haseman-Elston regression, we integrate a fast randomization algorithm to estimate h²; the resulting method, RHE-reg, can handle biobank-scale data such as UK Biobank (UKB) very efficiently. Furthermore, we present an analytical solution that balances computational cost against the precision of the estimate, a property that is important when dealing with biobank-scale data. We investigated the performance of RHE-reg in simulated data and also applied it to 81 UKB quantitative traits; in UKB data of nearly 300,000 unrelated individuals, an estimation took about 4.5 hours on average when using 10 CPUs. We extended the application of RHE-reg to distributed datasets in a way that does not compromise privacy. In both UKB and simulated data, RHE-reg estimated h² accurately. The software for estimating SNP-heritability in biobank-scale data is released.
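To make the idea of combining Haseman-Elston regression with randomization concrete, here is a minimal Python sketch. It is not the authors' released software, and the abstract does not specify their algorithm at this level of detail. The sketch assumes the standard method-of-moments form of Haseman-Elston regression, in which the genetic and residual variances solve a 2×2 system involving tr(K), tr(K²), and y'Ky for the genetic relationship matrix K = ZZ'/m, and it assumes tr(K²) is approximated with random Gaussian probe vectors so that the n×n matrix K is never formed. The function names (`standardize`, `simulate`, `rhe_reg`) and the parameter `n_vectors` are hypothetical.

```python
import numpy as np


def standardize(G):
    """Center and scale each SNP column to mean 0 and variance 1."""
    return (G - G.mean(axis=0)) / G.std(axis=0)


def simulate(n, m, h2, seed=0):
    """Simulate standardized genotypes Z (n x m) and a phenotype y with heritability h2.

    Illustrative assumptions: independent SNPs, allele frequencies drawn
    uniformly from (0.1, 0.5), and normally distributed SNP effects.
    """
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, size=m)
    G = rng.binomial(2, freqs, size=(n, m)).astype(float)
    Z = standardize(G)
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return Z, y


def rhe_reg(Z, y, n_vectors=50, seed=0):
    """Estimate SNP-heritability by Haseman-Elston regression with a randomized trace.

    Z is the standardized genotype matrix (n individuals x m SNPs).
    K = Z Z' / m is only ever applied to vectors, never built explicitly.
    """
    n, m = Z.shape
    rng = np.random.default_rng(seed)
    y = (y - y.mean()) / y.std()

    # y' K y computed as ||Z' y||^2 / m
    yKy = np.sum((Z.T @ y) ** 2) / m

    # Randomized estimate of tr(K^2): average of ||K r||^2 over Gaussian probes r
    R = rng.standard_normal((n, n_vectors))
    KR = Z @ (Z.T @ R) / m
    tr_K2 = np.sum(KR ** 2) / n_vectors

    # tr(K) is exact and cheap: sum of squared entries of Z divided by m
    tr_K = np.sum(Z ** 2) / m

    # Method-of-moments normal equations for (sigma_g^2, sigma_e^2)
    A = np.array([[tr_K2, tr_K], [tr_K, float(n)]])
    b = np.array([yKy, y @ y])
    sigma_g2, sigma_e2 = np.linalg.solve(A, b)
    return sigma_g2 / (sigma_g2 + sigma_e2)
```

A call such as `rhe_reg(*simulate(n=1000, m=500, h2=0.5))` returns an estimate that should fall near the simulated value, up to sampling and randomization noise. Increasing `n_vectors` reduces the noise in the trace estimate at a proportional increase in matrix-vector products, which is the kind of cost-versus-precision trade-off the abstract refers to.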
