Feature selection with interactions in logistic regression models using multivariate synergies for a GWAS application

利用多元协同效应在逻辑回归模型中进行交互特征选择,以应用于全基因组关联分析(GWAS)

阅读:1

Abstract

BACKGROUND: Genotype-phenotype association has been one of the long-standing problems in bioinformatics. Identifying both the marginal and epistatic effects among genetic markers, such as Single Nucleotide Polymorphisms (SNPs), has been extensively integrated in Genome-Wide Association Studies (GWAS) to help derive "causal" genetic risk factors and their interactions, which play critical roles in life and disease systems. Identifying "synergistic" interactions with respect to the outcome of interest can help accurate phenotypic prediction and understand the underlying mechanism of system behavior. Many statistical measures for estimating synergistic interactions have been proposed in the literature for such a purpose. However, except for empirical performance, there is still no theoretical analysis on the power and limitation of these synergistic interaction measures. RESULTS: In this paper, it is shown that the existing information-theoretic multivariate synergy depends on a small subset of the interaction parameters in the model, sometimes on only one interaction parameter. In addition, an adjusted version of multivariate synergy is proposed as a new measure to estimate the interactive effects, with experiments conducted over both simulated data sets and a real-world GWAS data set to show the effectiveness. CONCLUSIONS: We provide rigorous theoretical analysis and empirical evidence on why the information-theoretic multivariate synergy helps with identifying genetic risk factors via synergistic interactions. We further establish the rigorous sample complexity analysis on detecting interactive effects, confirmed by both simulated and real-world data sets.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。