Chlamy_ChloroPred: a deep learning-based, highly accurate binary classifier for chloroplast protein prediction in the model microalga, Chlamydomonas reinhardtii, with potential cross-proteome versatility

Abstract

INTRODUCTION: The chloroplast, a living relic of an ancient endosymbiotic interaction between a microalga and a microbe and the principal subcellular organelle responsible for biological CO2 assimilation, is emerging as a key target for research to enhance photosynthetic efficiency beyond its current limitations. Accurate protein localization is a prerequisite for in-depth investigation and practical application of this membrane-compartmentalized photosynthetic organelle. Numerous computational prediction tools have been proposed, yet their accuracy remains unsatisfactory.

METHODS: To address this limitation, we present Chlamy_ChloroPred, a newly developed deep learning-based framework composed of multi-layered artificial neural networks and designed to perform binary classification of chloroplast proteins in the model photosynthetic microorganism, Chlamydomonas reinhardtii. The model captures locality-aware features of determinant amino acid residues in the chloroplast transit peptide (cTP), generally located within the ~50-amino-acid N-terminal region of mature chloroplast proteins, by integrating ProtBERT-BFD embeddings, stacked bidirectional long short-term memory (BiLSTM) networks, and an attentive pooling layer.

RESULTS AND DISCUSSION: Our model achieved an accuracy of 0.8462 for the C. reinhardtii proteome, outperforming widely used localization predictors, including TargetP 1.1 (0.4970), TargetP 2.0 (0.7396), and PredAlgo (0.7738), under a binary classification scheme. Comparative analyses further demonstrated that Chlamy_ChloroPred performs competitively with the current state-of-the-art model, PB-Chlamy (0.8521), under identical evaluation conditions. Notably, despite being trained solely on the algal proteome, Chlamy_ChloroPred showed substantial cross-species versatility when applied to the proteome of the terrestrial plant Arabidopsis thaliana, achieving an accuracy of 0.7316, a 12.6% improvement over TargetP 2.0, a predictor with previously demonstrated cross-proteome versatility. This likely stems from the model's ability to capture features of chloroplast proteins that are conserved across proteomes from diverse photosynthetic lineages.

CONCLUSION: We developed a deep learning-based framework, Chlamy_ChloroPred, that integrates carefully designed neural layers with low computational complexity, achieving high predictive accuracy and interpretability. We believe that Chlamy_ChloroPred represents a compelling alternative to existing predictors, especially when accurate inference of chloroplast proteins is required.
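
The attentive pooling step named in the Methods can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes a common form of attention pooling in which each per-residue feature vector (for example, a BiLSTM output) is scored against a learned vector, the scores are normalized with a softmax, and the weighted sum becomes the fixed-length protein representation. The names `attentive_pool`, `query`, and the toy values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(
    hidden: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse per-residue vectors into one vector via attention weights.

    hidden: one feature vector per residue (e.g. BiLSTM outputs).
    query:  hypothetical learned scoring vector, same length as each
            feature vector (an assumption, not taken from the paper).
    Returns (pooled_vector, attention_weights).
    """
    # Dot-product score for each residue position.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    # Weighted sum over positions, computed per feature dimension.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]
    return pooled, weights


# Toy input: 4 residues, 3 features each (values are made up).
hidden_states = [
    [0.1, 0.0, 0.2],
    [0.9, 0.1, 0.4],
    [0.2, 0.8, 0.1],
    [0.0, 0.1, 0.0],
]
query_vector = [1.0, 0.5, 0.0]

pooled, weights = attentive_pool(hidden_states, query_vector)
```

Because the weights form a distribution over residue positions, inspecting them is one plausible route to the interpretability mentioned in the Conclusion: positions with larger weights contribute more to the pooled representation.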
