SPAmix: a scalable, accurate, and universal analysis framework for large-scale genetic association studies in admixed populations

Abstract

BACKGROUND: Including individuals with diverse or admixed genetic ancestries is crucial for discovering novel findings that may be missed by genomic analyses rooted solely in European populations.

RESULTS: Here, we present SPAmix, an analysis framework that scales to large biobank analyses including hundreds of thousands of admixed individuals and applies universally to various types of complex traits, including quantitative, time-to-event, ordinal, and longitudinal traits. Because no alternative model is fitted, SPAmix primarily focuses on association p values. For each genetic variant, SPAmix uses genotype data and genetic principal components to estimate individual-specific allele frequencies, which are subsequently used to calibrate p values via a retrospective analysis. A hybrid strategy including saddlepoint approximation (SPA) greatly increases accuracy when analyzing rare genetic variants, especially if the phenotypic distribution is unbalanced or extremely unbalanced. We also propose SPAmix(local), which incorporates local ancestry to calculate ancestry-specific p values. To maximize statistical power, SPAmix(CCT) combines the p values of SPAmix and SPAmix(local) via Cauchy combination.

CONCLUSIONS: The SPAmix-based approaches are more accurate than Tractor in addressing phenotypic variance heterogeneity among ancestries when analyzing quantitative traits and in addressing an unbalanced case-control ratio when analyzing binary traits. SPAmix(CCT) is an optimal unified approach for various cross-ancestry genetic architectures. Extensive simulation studies and real data analyses of 369,314 UK Biobank individuals from multiple ancestries demonstrated that SPAmix is scalable and can discover novel hits while controlling type I error rates well.
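To make the Cauchy combination step concrete, the following is a minimal Python sketch of the general Cauchy combination test, which maps each p value to a Cauchy-distributed score, averages the scores, and converts the average back to a p value. It is not the SPAmix implementation. The function name `cauchy_combine`, the equal default weights, and the small-p cutoffs are assumptions made for illustration, since the abstract does not state how SPAmix(CCT) weights its inputs.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p values with the Cauchy combination test.

    Illustrative sketch only: equal weights are assumed when none are given.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    total = float(sum(weights))

    stat = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("each p value must lie strictly between 0 and 1")
        if p < 1e-15:
            # tan((0.5 - p) * pi) = cot(p * pi), which is about 1 / (p * pi)
            # for tiny p; this avoids losing p when computing 0.5 - p.
            term = 1.0 / (p * math.pi)
        else:
            term = math.tan((0.5 - p) * math.pi)
        stat += (w / total) * term

    if stat > 1e15:
        # Same approximation in reverse for a very large test statistic.
        return 1.0 / (stat * math.pi)
    # Upper tail probability of a standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical p values for one variant: a global test and a local-ancestry test.
p_global = 0.01
p_local = 0.9
p_combined = cauchy_combine([p_global, p_local])
```

With a single input the function returns that p value unchanged, and with several inputs the combined value tends to be driven by the smallest one, which is why this kind of combination can retain power when only one of the component tests detects a signal.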
