Abstract
Identifying multifunctional therapeutic peptides (MFTPs) is an important yet complex challenge in peptide recognition. Unlike the classification of monofunctional peptides, MFTP classification requires discerning fine-grained label information associated with amino acids, which makes the task more intricate. Existing methods often ignore the semantics of these labels and fail to fully explore the interplay between peptide sequences and their labels. To address these issues, we propose a multilabel classification method named MultiPep-DLCL. The method uses a deep learning architecture to translate peptide sequences into sequence features by learning the local and global dependencies within MFTP sequences. In addition, a Label-Sequence Fusion Transformer learns high-quality label embeddings by mining informative content from the peptide sequences. Finally, label-sequence contrastive learning strengthens the correspondence between sequence features and label embeddings. To tackle dataset imbalance, MultiPep-DLCL combines a multilabel focal dice loss with the traditional cross-entropy loss. Experimental results demonstrate that MultiPep-DLCL significantly outperforms existing methods in MFTP recognition.
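To illustrate how a dice-style term can be combined with cross-entropy under label imbalance, the following is a minimal sketch and not the formulation used by MultiPep-DLCL. The exact form of the focal dice term is an assumption: here it is a per-label soft dice loss raised to the power `1 / gamma`. The function names, the weight `lam`, and the smoothing constant `eps` are hypothetical and chosen only for illustration.

```python
import math


def bce_loss(probs, targets, clip=1e-7):
    """Mean binary cross-entropy over every (peptide, label) pair."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, targets):
        for p, y in zip(p_row, y_row):
            p = min(max(p, clip), 1.0 - clip)  # keep log() finite
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def focal_dice_loss(probs, targets, gamma=2.0, eps=1.0):
    """Mean over labels of (1 - soft dice) ** (1 / gamma).

    Assumed form: the abstract does not specify the focal dice term.
    Dice is computed per label across the batch, so a rare label
    contributes as much to the loss as a frequent one.
    """
    n_labels = len(probs[0])
    total = 0.0
    for j in range(n_labels):
        inter = sum(p[j] * y[j] for p, y in zip(probs, targets))
        mass = sum(p[j] for p in probs) + sum(y[j] for y in targets)
        dice = (2.0 * inter + eps) / (mass + eps)
        # max() guards against a tiny negative value from rounding.
        total += max(0.0, 1.0 - dice) ** (1.0 / gamma)
    return total / n_labels


def combined_loss(probs, targets, lam=1.0, gamma=2.0):
    """Cross-entropy plus a weighted focal dice term (lam is hypothetical)."""
    return bce_loss(probs, targets) + lam * focal_dice_loss(probs, targets, gamma)


# Toy batch: two peptides, two functional labels (values are made up).
example_targets = [[1, 0], [0, 1]]
example_probs = [[0.9, 0.1], [0.2, 0.8]]
print(combined_loss(example_probs, example_targets))
```

Because the dice term is normalised per label rather than per sample, it does not shrink when a label has few positives, which is the usual motivation for pairing it with cross-entropy on imbalanced multilabel data.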