Identifying protein succinylation sites using generative transformer and a two-dimensional representation with a deep capsule network

Abstract

Protein succinylation is a vital post-translational modification that regulates diverse cellular processes. Accurate identification of succinylation sites is crucial for understanding protein function and for developing targeted drugs. In this study, we propose an intelligent computational model, iSucc-SnCNs, which encodes protein sequences using a ProtGPT2-based protein language model. Structural representations are derived from SMR and PSSM matrices, from which SMR-HOG, SMR-DCT, and PSSM-DWT features are extracted. The BTGA+KNN algorithm then selects the top-ranked features from the resulting hybrid feature vector. Finally, a self-normalized capsule neural network (Sn-CapsNet) is trained on the BTGA-selected optimal feature set. The proposed iSucc-SnCNs achieved an accuracy of 92.92% and an AUC of 0.96, outperforming traditional models by 17%. On two independent datasets (Ind-I and Ind-II), iSucc-SnCNs improved performance by approximately 13% and 2%, respectively, which supports its ability to generalize. These results highlight iSucc-SnCNs as a robust and efficient framework for large-scale succinylation site prediction and for protein function analysis in drug discovery.
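
To make the idea of a transform-based feature such as SMR-DCT more concrete, the following is a minimal sketch of how a two-dimensional discrete cosine transform can compress a matrix representation into a short feature vector. The abstract does not give the authors' exact procedure, so this is an illustration under assumptions, not their implementation: the function names `dct_1d`, `dct_2d`, and `dct_features`, the orthonormal DCT-II scaling, the `keep` parameter, and the choice to retain only the top-left (low-frequency) block of coefficients are all hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence (illustrative, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # Orthonormal scaling: the k = 0 term uses a different factor.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(matrix):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in matrix]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result has the same shape as the input.
    return [list(r) for r in zip(*cols)]


def dct_features(matrix, keep=2):
    """Flatten the top-left keep x keep block of low-frequency coefficients.

    Assumption: low-frequency coefficients carry most of the signal,
    so they are kept as a fixed-length feature vector.
    """
    coeffs = dct_2d(matrix)
    return [coeffs[i][j] for i in range(keep) for j in range(keep)]


# Toy 4 x 4 matrix standing in for a (much larger) SMR or PSSM matrix.
toy = [
    [1.0, 2.0, 0.0, -1.0],
    [0.5, 1.5, 0.0, -0.5],
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
features = dct_features(toy, keep=2)
print(len(features))
```

In a pipeline of the kind the abstract describes, a vector like `features` would be concatenated with the other descriptors before feature selection. The fixed `keep` value gives every protein a feature vector of the same length regardless of the size of its matrix.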
