Abstract
Understanding gene-disease associations is important for uncovering pathological mechanisms and identifying potential therapeutic targets. Knowledge graphs can represent and integrate data from multiple biomedical sources, but they lack individual-level information on target organ structure and function. Here we developed CardioKG, a knowledge graph that integrates over 200,000 computer vision-derived cardiovascular phenotypes from biomedical images with data extracted from 18 biological databases to model over a million relationships. We used a variational graph auto-encoder to generate node embeddings from the knowledge graph, and used these embeddings to predict gene-disease associations, assess druggability and identify drug repurposing strategies. The model predicted genetic associations and therapeutic opportunities for leading causes of cardiovascular disease, which were associated with improved survival. Candidate therapies included methotrexate for heart failure and gliptins for atrial fibrillation. The addition of imaging data enhanced pathway discovery. These capabilities support the use of biomedical imaging to enhance graph-structured models for identifying treatable disease mechanisms.
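
For readers unfamiliar with variational graph auto-encoders, the forward pass can be sketched as follows. This is an illustration only and not the CardioKG implementation: it assumes the standard formulation (a two-layer graph convolutional encoder producing a mean and log standard deviation per node, with an inner-product decoder) on a simple undirected graph. The toy adjacency matrix, one-hot features, layer sizes, random untrained weights and all function names are hypothetical, and training is omitted.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by graph convolutions."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def init_params(n_features, n_hidden, n_latent, rng):
    """Small random weights; a real model would learn these by gradient descent."""
    return {
        "w0": 0.1 * rng.standard_normal((n_features, n_hidden)),
        "w_mu": 0.1 * rng.standard_normal((n_hidden, n_latent)),
        "w_log_std": 0.1 * rng.standard_normal((n_hidden, n_latent)),
    }


def vgae_forward(adj, features, params, rng):
    """Encode nodes to latent embeddings and decode edge probabilities."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ params["w0"], 0.0)   # shared GCN layer + ReLU
    mu = a_norm @ hidden @ params["w_mu"]                        # per-node mean
    log_std = a_norm @ hidden @ params["w_log_std"]              # per-node log std
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
    z = mu + np.exp(log_std) * rng.standard_normal(mu.shape)
    # Inner-product decoder: probability of an edge between every node pair
    edge_probs = 1.0 / (1.0 + np.exp(-(z @ z.T)))
    return z, mu, log_std, edge_probs


def kl_divergence(mu, log_std):
    """Mean per-node KL divergence between N(mu, sigma^2) and N(0, I)."""
    per_node = -0.5 * np.sum(1.0 + 2.0 * log_std - mu**2 - np.exp(2.0 * log_std), axis=1)
    return float(per_node.mean())


# Hypothetical toy graph: a 4-node path 0-1-2-3 with one-hot node features.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = np.eye(4)
rng = np.random.default_rng(0)
params = init_params(n_features=4, n_hidden=8, n_latent=2, rng=rng)

z, mu, log_std, edge_probs = vgae_forward(adj, features, params, rng)
kl = kl_divergence(mu, log_std)
```

In such a setup, the rows of `z` serve as node embeddings, and high-probability entries of `edge_probs` that are absent from `adj` are candidate links. A link-prediction use such as gene-disease association would rank those candidates after training.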