Synthetic surrogates improve power for genome-wide association studies of partially missing phenotypes in population biobanks

Abstract

Within population biobanks, incomplete measurement of certain traits limits the power for genetic discovery. Machine learning is increasingly used to impute the missing values from the available data. However, performing genome-wide association studies (GWAS) on imputed traits can introduce spurious associations, identifying genetic variants that are not associated with the original trait. Here we introduce a new method, synthetic surrogate (SynSurr) analysis, which makes GWAS on imputed phenotypes robust to imputation errors. Rather than replacing missing values, SynSurr jointly analyzes the original and imputed traits. We show that SynSurr estimates the same genetic effect as standard GWAS and improves power in proportion to the quality of the imputations. SynSurr requires a commonly made missing-at-random assumption but relaxes the requirements of existing imputation methods by not requiring correct model specification. We present extensive simulations and ablation analyses to validate SynSurr and apply it to empower the GWAS of dual-energy X-ray absorptiometry traits within the UK Biobank.
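The contrast between replacing missing values and jointly analyzing the original and imputed traits can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a bivariate normal working model, a surrogate observed for every subject, and missingness completely at random (a special case of missing at random). Under those assumptions the joint likelihood factors into a regression of the surrogate on genotype in all subjects and a regression of the target on genotype and surrogate in subjects with an observed target. All names (`joint_effect`, `simulate`) and all simulation settings (effect sizes, allele frequency, missingness rate) are hypothetical choices made for illustration.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def joint_effect(g, target, surrogate):
    """Marginal effect of genotype on the target from a joint analysis.

    Assumes the surrogate is observed for everyone and the target is NaN
    where missing. Uses the factored likelihood for monotone missing data:
      step 1: surrogate ~ genotype              (all subjects)
      step 2: target ~ genotype + surrogate     (subjects with observed target)
    and recombines the two fits into the marginal genotype effect on the target.
    """
    observed = ~np.isnan(target)
    ones = np.ones(len(g))
    alpha = ols(np.column_stack([ones, g]), surrogate)[1]
    X_obs = np.column_stack([ones[observed], g[observed], surrogate[observed]])
    _, gamma, delta = ols(X_obs, target[observed])
    return gamma + delta * alpha


def simulate(seed=0, n=50_000, beta=0.1, missing_rate=0.75):
    """Compare three estimators of the genotype effect on a partially missing trait."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # additive genotype coding
    e = rng.normal(size=n)
    target = beta * g + e
    # An imperfect imputation: correlated with the target, but carrying the
    # wrong genetic effect (0.3 rather than beta).
    surrogate = 0.3 * g + 0.8 * e + 0.6 * rng.normal(size=n)
    missing = rng.random(n) < missing_rate
    target_obs = np.where(missing, np.nan, target)

    X = np.column_stack([np.ones(n), g])
    # Naive: replace missing targets with the surrogate, then regress on genotype.
    naive = ols(X, np.where(missing, surrogate, target))[1]
    # Complete case: discard subjects with a missing target.
    complete_case = ols(X[~missing], target[~missing])[1]
    # Joint: keep the target and the surrogate as separate outcomes.
    joint = joint_effect(g, target_obs, surrogate)
    return {"naive": naive, "complete_case": complete_case, "joint": joint}


if __name__ == "__main__":
    for name, estimate in simulate().items():
        print(f"{name}: {estimate:.3f}")
```

In this setup the naive estimate is pulled toward the surrogate's genetic effect, whereas the complete-case and joint estimates both target the true effect. The joint estimate additionally borrows information from subjects whose target is missing, which is where any gain in power would come from.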
