DeepPlantAllergy: deep learning for explainable prediction of allergenicity in plant proteins

DeepPlantAllergy:利用深度学习对植物蛋白的致敏性进行可解释预测

阅读:1

Abstract

Allergy is an immune response triggered by specific peptides recognized by immune system effectors. While several bioinformatics tools have been developed to predict protein allergenicity, most rely on hand-selected features and lack interpretability. Improved predictive and explainable models are needed, especially for under-studied plant allergens. We present DeepPlantAllergy, a deep learning model that combines Convolutional Neural Networks (CNNs), Bidirectional Long Short-Term Memory (BiLSTM) networks, and Multi-Head Self-Attention (MHSA) to capture both local patterns and long-range dependencies within protein sequences. We evaluated four embedding techniques-including one-hot encoding, SeqVec, ProtBert, and ESM-1B-and employed Integrated gradients to identify residues contributing to allergenicity. Predictive performance was similar for ESM-1B and ProtBert embeddings, with no statistically significant difference, with an F1 score of 93.9% and 93.6% and AUC of 97.74% and 97.8%, respectively. Motif extraction revealed complementary strengths: ProtBert highlighted regions similar to OneHot patterns, while ESM captured distinct segments, and SeqVec identified additional regions overlapping with experimentally validated epitopes. Notably, molecular docking confirmed the biological plausibility of a predicted epitope, supporting the utility of residue-level predictions. DeepPlantAllergy thus offers both high predictive accuracy and interpretable insights, facilitating the discovery of allergenic motifs in under-characterized plant proteins. The source code, datasets used for training and evaluation, trained models, and the full pipeline for prediction and motif identification are available at the GitHub Repository: https://github.com/Lilly-dh/DeepPlantAllergy.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。