Abstract
With rapid advances in biomedical research, the volume of available peptide data has expanded significantly, heightening the need for computational tools that can efficiently identify multifunctional therapeutic peptides (MFTPs). These peptides hold great promise as novel therapies across various disease contexts. However, existing computational frameworks face considerable challenges, such as severe class imbalance and difficulty in capturing latent peptide characteristics, leaving substantial room for improvement in predictive performance. In this study, we introduce SeekTP, a novel computational framework designed to accurately identify MFTPs from amino acid sequences by integrating diverse predicted structural and sequential properties from multiple categories. The model constructs graph representations from predicted structural, sequential, and embedding information and employs a graph attention network to encode complex interactions. Concurrently, self-attention mechanisms and convolutional neural networks are applied to capture intricate patterns in peptide properties. A feed-forward neural network serves as the final prediction layer. Extensive benchmarking experiments show that SeekTP achieves better performance than state-of-the-art methods on independent test datasets, demonstrating its advantages in classification tasks. Additionally, we systematically analyze the discriminative power achieved by various combinations of feature representations and modeling strategies, underscoring the robustness and flexibility of our approach.