Abstract
RNA-protein interactions (RPIs) are essential for many biological functions and are associated with various diseases. Traditional methods for detecting RPIs are labor-intensive and costly, which creates a need for efficient computational methods. In this study, we proposed a novel sequence-based RPI prediction framework based on graph neural networks (GNNs) that addressed key limitations of existing methods, such as inadequate feature integration and inadequate negative sample construction. Our method represented RNAs and proteins as nodes in a unified interaction graph, enhanced the representation of RPI pairs through multi-feature fusion, and employed self-supervised learning strategies for model training. In five-fold cross-validation, the model achieved accuracies of 0.880, 0.811, 0.950, 0.979, 0.910, and 0.924 on the RPI488, RPI369, RPI2241, RPI1807, RPI1446, and RPImerged datasets, respectively. In cross-species generalization tests, our method outperformed existing methods, achieving an overall accuracy of 0.989 across 10 093 RPI pairs. Compared with other state-of-the-art RPI prediction methods, our approach demonstrated greater robustness and stability, highlighting its potential for broad biological applications and large-scale RPI analysis.