Abstract
Cancer is mainly caused by a relatively small portion of somatic genome alterations (SGAs), called cancer drivers. Although many cancer drivers have been identified, many more remain to be discovered to explain various cancers. Moreover, few tools are available for identifying potential interactions among cancer drivers, which are needed for a better understanding of oncogenesis. To tackle these challenges, we have developed a novel approach called individualized Bayesian inference using a decision tree (IBI-DT). IBI-DT accounts for the genetic heterogeneity among cancer patients: individuals or patient subgroups with distinct genomic makeup may have different drivers. IBI-DT uses a decision tree structure to construct smaller subgroups with similar genetic makeup (i.e., patient-like-me subgroups) and then analyzes multiple trees to identify the SGAs that play a significant role in regulating downstream gene expression patterns at the subgroup and individual levels. This differs from population-based approaches, which tend to evaluate the influence of an SGA across the entire population and are therefore likely to miss low-frequency SGAs that may explain cancer in a small subgroup of patients. Importantly, IBI-DT can also efficiently identify cancer drivers that may interact functionally. We applied IBI-DT to identify cancer drivers regulating downstream differential gene expression in cancer patients and compared it to the standard, population-based method of expression quantitative trait loci (eQTL) analysis. Our results show that IBI-DT performs well in identifying both important cancer drivers, especially low-frequency drivers, and their interactions, allowing for a better understanding of cancer signaling pathways.