Abstract
The growth, development, and differentiation of multicellular organisms are driven primarily by intercellular communication, which coordinates the activities of diverse cell types. This cell-to-cell signaling is typically mediated by several types of protein-protein interactions, including ligand-receptor, receptor-receptor, and extracellular matrix-receptor interactions. Current computational methods for inferring ligand-receptor communication depend primarily on gene expression data for ligand-receptor pairs and on spatial information about cells. Some approaches also integrate protein complexes, transcription factors, or pathway information to construct cell communication networks. However, few methods consider the critical role of protein-protein interactions (PPIs) in intercellular communication networks, especially when predicting communication between different cell types in the absence of cell type information. Existing methods often rely on ligand-receptor pairs that lack PPI evidence, potentially compromising the accuracy of their predictions. To address this issue, we propose CellGAT, a framework that infers intercellular communication by integrating gene expression data of ligand-receptor pairs, PPI information, protein complex data, and experimentally validated pathway information. CellGAT not only builds a priori models but also uses node embedding algorithms and graph attention networks to construct cell communication networks from single-cell RNA sequencing (scRNA-seq) datasets, and it includes a built-in cell clustering algorithm. In comparisons with various existing methods, CellGAT accurately predicts cell-cell communication (CCC) and analyzes its impact on downstream pathways, neighboring cells, and drug interventions.