Abstract
Interleukin-2 (IL-2) is a pivotal cytokine that regulates immune responses, particularly the activation and proliferation of T cells, which makes the identification of IL-2-inducing peptides vital for advancing immunotherapy and vaccine development. In this study, we present a computational approach for predicting IL-2-inducing peptides. Positive and negative peptide datasets were obtained from the Immune Epitope Database (IEDB), and features were extracted using the Pfeature and iFeature tools and large language models such as ProtBERT. Our extra-trees-based model, developed on dipeptide deviation from expected mean (DDE) features, achieved an external validation accuracy of 79.88%, a sensitivity of 81.24%, and a Matthews correlation coefficient (MCC) of 0.6. The model was then used to predict IL-2-inducing peptides from the global viral RefSeq proteome, comprising 14,365 unique viruses encoding 374,209 proteins and yielding 155.68 million peptides. This analysis identified several promising virus-encoded IL-2-inducing candidates. A literature review confirmed that some of the viral proteins containing these peptides have been experimentally validated to induce IL-2, supporting the reliability of our prediction pipeline. To facilitate broader use, we have made the tool available through a user-friendly web server (http://www.soodlab.com/il2pepscan/), enabling the scientific community to assess the IL-2-inducing potential of their peptides of interest. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-35977-6.
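
To make the DDE descriptor mentioned above concrete, the following is a minimal sketch of how such a 400-dimensional feature vector might be computed for a single peptide. It assumes the commonly used definition of DDE, in which the observed dipeptide frequency is standardized by a theoretical mean and variance derived from codon counts in the standard genetic code. The function name `dde_features` and the constants are hypothetical illustrations; the study itself extracted features with existing tools, and this sketch is not their implementation.

```python
import math
from itertools import product

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Number of codons encoding each amino acid in the standard genetic code
# (61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}
TOTAL_SENSE_CODONS = 61

# All 400 ordered dipeptides; this fixes the order of the feature vector.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dde_features(peptide):
    """Return a 400-element DDE vector for one peptide (illustrative only)."""
    seq = peptide.upper()
    if len(seq) < 2:
        raise ValueError("peptide must contain at least two residues")
    if any(aa not in CODON_COUNTS for aa in seq):
        raise ValueError("peptide contains a non-standard residue")

    n_dipeptides = len(seq) - 1

    # Count every overlapping dipeptide in the sequence.
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_dipeptides):
        counts[seq[i:i + 2]] += 1

    features = []
    for dp in DIPEPTIDES:
        # Observed dipeptide composition.
        dc = counts[dp] / n_dipeptides
        # Theoretical mean from codon usage of the two residues.
        tm = (CODON_COUNTS[dp[0]] / TOTAL_SENSE_CODONS) * (
            CODON_COUNTS[dp[1]] / TOTAL_SENSE_CODONS
        )
        # Theoretical variance of that composition.
        tv = tm * (1 - tm) / n_dipeptides
        # Deviation of the observed value from the expected mean.
        features.append((dc - tm) / math.sqrt(tv))
    return features
```

Under these assumptions, calling `dde_features("ACAC")` returns one value per dipeptide in `DIPEPTIDES`: dipeptides present in the peptide (here `AC` and `CA`) score above zero, and absent ones score below zero. A vector like this could then be passed to any standard classifier.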