Abstract
This paper proposes a protein-protein interaction (PPI) extraction model enhanced by entity semantics, addressing key challenges in biomedical text mining. To address the high cost of constructing high-quality PPI corpora, data scarcity, and the diversity of semantic expressions, we design three core modules: (1) an Attention-based Contextual Information Enhancement module that captures relation-relevant contextual semantics; (2) a large language model-based Multi-dimensional Semantic Information Enhancement module that generates rich semantic representations of entities; and (3) a Multimodal Language-Interaction Protein Graph Encoder that fuses textual semantics with structural information for relation prediction. Experiments on five standard PPI datasets (AIMed, BioInfer, HPRD50, IEPA, and LLL) show that the proposed method significantly outperforms existing methods, achieving the best average F1-score among the compared approaches. Ablation experiments further confirm that each module is effective and contributes to overall performance. Beyond advancing PPI extraction, this work offers new technical approaches and research insights for biomedical text mining.