Abstract
Anticancer peptides (ACPs) have demonstrated potent antitumor activity and low toxicity, offering considerable potential in cancer therapeutics. Meanwhile, antimicrobial peptides (AMPs) serve as key components of the innate immune defense system. Owing to their broad-spectrum antimicrobial activity and low propensity for inducing resistance, AMPs have attracted wide attention in the fields of infection control and immunotherapy. Accurate identification of ACPs and AMPs is critical for the discovery of novel therapeutic agents. However, wet-lab identification is time-consuming, costly, and inefficient, and it falls short of the demands of high-throughput drug screening. Furthermore, existing computational methods are limited in feature representation and cross-task prediction capability. To address these challenges, GP2FI, a tool for functional peptide prediction, is proposed. It consists of two sequential stages: a gene prediction model (MHA-preconv) and a functional peptide identification model (FuncPred-CB). MHA-preconv integrates CNNs with Transformer encoder layers to form a two-stage deep architecture that captures both local sequence patterns and long-range dependencies. Operating on the coding regions identified by MHA-preconv, FuncPred-CB incorporates a pre-trained BERT language model to automatically extract contextual semantic features from amino acid sequences. Experimental results on multiple benchmark datasets demonstrate that MHA-preconv and GP2FI consistently outperform state-of-the-art methods in terms of accuracy and other performance metrics. The code for GP2FI is available at https://github.com/ma999-mxl/maLBX.git.
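To make the two-stage data flow concrete (coding-region detection followed by functional peptide identification), the following is a minimal, illustrative sketch only. It does not reproduce MHA-preconv or FuncPred-CB: stage 1 is replaced by a naive forward-strand open-reading-frame scan, and stage 2 by a toy scorer based on the fraction of cationic residues. All function names, the threshold, and the example sequence are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Tuple

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_regions(dna: str, min_codons: int = 3) -> List[str]:
    """Stand-in for stage 1 (NOT MHA-preconv): naive forward-strand ORF scan."""
    dna = dna.upper()
    regions = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    regions.append(dna[start:pos])  # stop codon excluded
                start = None
    return regions


def translate(cds: str) -> str:
    """Translate a coding sequence; unknown codons map to 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


def toy_scorer(peptide: str) -> float:
    """Stand-in for stage 2 (NOT FuncPred-CB): fraction of K/R residues."""
    if not peptide:
        return 0.0
    return sum(peptide.count(r) for r in "KR") / len(peptide)


def predict_functional_peptides(
    dna: str,
    scorer: Callable[[str], float] = toy_scorer,
    threshold: float = 0.3,  # hypothetical cut-off, for illustration only
) -> List[Tuple[str, bool]]:
    """Chain the two stages: DNA -> coding regions -> peptides -> labels."""
    results = []
    for cds in find_coding_regions(dna):
        peptide = translate(cds)
        results.append((peptide, scorer(peptide) >= threshold))
    return results


if __name__ == "__main__":
    print(predict_functional_peptides("CCATGAAACGTAAATTTTAAGG"))
```

In a real pipeline, each stand-in would be swapped for a trained model; passing the stage-2 scorer as a callable keeps the chaining logic independent of the classifier used.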