Abstract
The Brain Genome Association (BGA) study, which investigates the associations between brain structure/function (characterized by neuroimaging phenotypes) and genetic variations (characterized by Single Nucleotide Polymorphisms (SNPs)), is important in the pathological analysis of neurological diseases. However, current BGA studies are limited because they do not explicitly consider disease labels, source importance, or sample importance in their formulations. We address these issues by proposing a robust and discriminative BGA formulation. Specifically, we learn two transformation matrices that map two heterogeneous data sources (i.e., neuroimaging data and genetic data) into a common space, so that samples from the same subject (but different sources) are close to each other, while samples with different labels are separable. In addition, we impose a sparsity constraint on the transformation matrices to enable feature selection on both data sources. Furthermore, both sample importance and source importance are incorporated into the formulation via adaptive, parameter-free sample and source weightings. We have conducted various experiments, using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, to test how well the neuroimaging phenotypes and SNPs can represent each other in the common space.
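To make the common-space idea concrete, the following is a minimal sketch and not the formulation proposed in this work. It assumes a simplified objective: a weighted squared distance between the two projections of each subject, plus a row-sparsity (L2,1) penalty on both transformation matrices. The names `project`, `l21_norm`, `pairing_loss`, `objective`, and the trade-off parameter `lam` are hypothetical, and the discriminative label term and the adaptive parameter-free weighting are deliberately omitted.

```python
import math


def project(X, W):
    """Map samples X (n x d) into the common space using W (d x k)."""
    k = len(W[0])
    return [[sum(x[i] * W[i][j] for i in range(len(x))) for j in range(k)]
            for x in X]


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W.

    Penalising this quantity pushes entire rows toward zero, which
    corresponds to discarding the matching input feature.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)


def pairing_loss(X_img, X_snp, W_img, W_snp, sample_w=None):
    """Weighted squared distance between the two projections of each subject.

    sample_w is an assumed, externally supplied per-subject weight;
    it defaults to uniform weights.
    """
    Z_img = project(X_img, W_img)
    Z_snp = project(X_snp, W_snp)
    if sample_w is None:
        sample_w = [1.0] * len(Z_img)
    return sum(w * sum((a - b) ** 2 for a, b in zip(zi, zs))
               for w, zi, zs in zip(sample_w, Z_img, Z_snp))


def objective(X_img, X_snp, W_img, W_snp, lam=0.1, sample_w=None):
    """Simplified objective: pairing loss plus sparsity penalty on both maps."""
    sparsity = l21_norm(W_img) + l21_norm(W_snp)
    return pairing_loss(X_img, X_snp, W_img, W_snp, sample_w) + lam * sparsity
```

In this sketch, a row of `W_img` or `W_snp` that shrinks to zero marks a neuroimaging feature or SNP that contributes nothing to the common space, which is how a sparsity constraint of this kind can act as feature selection.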