Abstract
In this paper, we propose a flexible Bayesian inference procedure for identifying high-dimensional functions that are significantly correlated with a response variable. This is challenging because the relationship between the response variable and the high-dimensional functions is unknown and is complicated by dependence among the functions. For example, in pathway-based analysis in genetics, a pathway is a set of genes that serves a particular cellular or physiological function, and it can be viewed as a high-dimensional function of genes. A pathway-based analysis can detect subtle changes in expression levels that a gene-based analysis cannot. However, pathways are not independent of each other. Because the clinical outcome is affected by multiple pathways, it is inappropriate to model them using marginal analysis, such as a single-pathway analysis. Estimating the effect of one pathway in isolation ignores the interactions among pathways and can therefore result in false positives or false negatives. To address this, we develop a generalized fused kernel machine regression for testing which high-dimensional functions are significantly correlated with the response variable, which can be either continuous or binary. Our data-driven Bayesian inference adjusts for multiple testing using the Bayes factor and accommodates dependence among the functions through a simple yet flexible structure. The benefits of our method are illustrated through a simulation study and through our motivating data on genetic pathway analysis related to type II diabetes.