Abstract
Protein post-translational modifications (PTMs) are a core regulatory mechanism governing protein function and cellular fate, and their dynamic alterations profoundly influence critical biological processes. However, existing research focuses primarily on single-PTM site prediction and remains confined to single-modality analysis. This study introduces UniGraphPTMs, the first universal PTM site prediction framework based on multimodal fusion and graph neural networks. UniGraphPTMs employs a master-slave architecture that breaks branch independence through multi-stage interactions. We integrate the protein structure pre-trained model SaProt with ProtT5 and ESM-C for the first time, enabling comprehensive exploration of multimodal sequence-structure embeddings. The master branch uses xLSTM and Mamba for sequence feature extraction, while the slave branch constructs a Hierarchical Graph Neural Network for multi-level structural feature extraction. To optimize cross-modal interactions, we design a novel Low-Rank Cross-Attention Bidirectional Gating fusion module. In addition, by incorporating a hierarchical contrastive loss function and a new dual-modality adaptive weighting mechanism, we address the challenge of synergistic learning across multiple losses. Evaluated on 11 datasets encompassing 6 distinct PTM types, UniGraphPTMs outperforms all previous models, with average improvements of 3.27% in AUC, 4.31% in MCC, and 3.94% in AP. Finally, we conduct a proof-of-concept study on multi-PTM joint prediction.