Testing a Large Number of Composite Null Hypotheses Using Conditionally Symmetric Multidimensional Gaussian Mixtures in Genome-Wide Studies

Abstract

Causal mediation, pleiotropy, and replication analyses are three highly popular genetic study designs. Although these analyses address different scientific questions, the underlying statistical inference problems all involve large-scale testing of composite null hypotheses. The goal is to determine whether all null hypotheses (as opposed to at least one) in a set of individual tests should simultaneously be rejected. Recently, various methods have been proposed for each of these situations, including an appealing two-group empirical Bayes approach that calculates local false discovery rates (lfdr). However, lfdr estimation is difficult because it requires multivariate density estimation. Furthermore, the multiple testing rules for the empirical Bayes lfdr approach can disagree with conventional frequentist z-statistics, which is troubling for a field that ubiquitously utilizes summary statistics. This work proposes a framework to unify two-group testing in genetic association composite null settings, the conditionally symmetric multidimensional Gaussian mixture model (csmGmm). The csmGmm is shown to have more robust operating characteristics than recently proposed alternatives. Crucially, the csmGmm also offers interpretability guarantees by harmonizing lfdr and z-statistic testing rules. We extend the base csmGmm to cover each of the mediation, pleiotropy, and replication settings, and we prove that the agreement between lfdr and z-statistic rules holds in each situation. We apply the model to a collection of translational lung cancer genetic association studies that motivated this work.
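As a rough illustration of what a two-group lfdr for a composite null looks like, the sketch below handles the two-dimensional case, where the composite null holds unless both individual tests are non-null. It is not the csmGmm fitting procedure from this work: the mixing proportions, the effect size, the unit-variance independent Gaussian components, and the function names are all hypothetical values chosen for illustration, and the symmetric placement of means at plus and minus the effect size is an assumption loosely suggested by the model's name.

```python
import math

def normal_pdf(z, mean):
    """Density of a unit-variance Gaussian at z."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)

def symmetric_density(z, effect):
    """N(0, 1) when effect is 0, else an equal mix of N(+effect, 1) and N(-effect, 1)."""
    if effect == 0.0:
        return normal_pdf(z, 0.0)
    return 0.5 * (normal_pdf(z, effect) + normal_pdf(z, -effect))

# Hypothetical mixing proportions for each configuration (h1, h2),
# where h = 1 means that dimension carries a true signal.
PROPORTIONS = {(0, 0): 0.90, (1, 0): 0.04, (0, 1): 0.04, (1, 1): 0.02}
EFFECT = 3.0  # hypothetical magnitude of a non-null mean

def composite_null_lfdr(z1, z2):
    """Posterior probability that at least one dimension is null, given (z1, z2)."""
    weighted = {}
    for (h1, h2), prop in PROPORTIONS.items():
        d1 = symmetric_density(z1, EFFECT * h1)
        d2 = symmetric_density(z2, EFFECT * h2)
        weighted[(h1, h2)] = prop * d1 * d2
    total = sum(weighted.values())
    # Every configuration except (1, 1) belongs to the composite null.
    null_mass = sum(w for config, w in weighted.items() if config != (1, 1))
    return null_mass / total

if __name__ == "__main__":
    for pair in [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (-4.0, 4.0)]:
        print(pair, composite_null_lfdr(*pair))
```

Under these made-up parameters, a pair with only one large z-statistic keeps an lfdr close to 1, while a pair with two large z-statistics gets a small lfdr regardless of sign, which is the behavior a composite null test is meant to capture.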
