UK Biobank Whole-Exome Sequence Binary Phenome Analysis with Robust Region-Based Rare-Variant Test

英国生物银行全外显子组序列二元表型分析及基于区域的稳健罕见变异检验

阅读:1

Abstract

In biobank data analysis, most binary phenotypes have unbalanced case-control ratios, and this can cause inflation of type I error rates. Recently, a saddle point approximation (SPA) based single-variant test has been developed to provide an accurate and scalable method to test for associations of such phenotypes. For gene- or region-based multiple-variant tests, a few methods exist that can adjust for unbalanced case-control ratios; however, these methods are either less accurate when case-control ratios are extremely unbalanced or not scalable for large data analyses. To address these problems, we propose SKAT- and SKAT-O- type region-based tests; in these tests, the single-variant score statistic is calibrated based on SPA and efficient resampling (ER). Through simulation studies, we show that the proposed method provides well-calibrated p values. In contrast, when the case-control ratio is 1:99, the unadjusted approach has greatly inflated type I error rates (90 times that of exome-wide sequencing α = 2.5 × 10(-6)). Additionally, the proposed method has similar computation time to the unadjusted approaches and is scalable for large sample data. In our application, the UK Biobank whole-exome sequence data analysis of 45,596 unrelated European samples and 791 PheCode phenotypes identified 10 rare-variant associations with p value < 10(-7), including the associations between JAK2 and myeloproliferative disease, HOXB13 and cancer of prostate, and F11 and congenital coagulation defects. All analysis summary results are publicly available through a web-based visual server, and this availability can help facilitate the identification of the genetic basis of complex diseases.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。