Accurate prediction of protein-ATP binding sites based on a protein pretrained large language model and a fractional-order convolutional neural network

Abstract

Adenosine triphosphate (ATP), a high-energy phosphate compound, serves as a direct energy source for living organisms. Proteins, composed of amino acids, are fundamental macromolecules and essential building blocks of life. The interaction between proteins and ATP is crucial for various biological processes, including movement, regulation, and metabolism. Predicting this interaction, and in particular locating the binding sites, underpins downstream studies; advances in prediction techniques therefore hold significant value for disease prevention, diagnosis, treatment, and drug design. However, current methods face several challenges: they typically rely on multiple algorithms to extract multilevel features that are then integrated into a single deep learning model, an approach that is inflexible and may lose important information implicit in the sequences. In this study, we propose a novel large language model (LLM)-based model, the pretrained fractional-order deep convolutional neural network (PFDCNN), to predict protein-ATP binding sites from sequence information. Sequence features are first extracted by a pretrained protein large language model; a deep convolutional neural network trained with fractional-order backpropagation then performs the prediction, and the loss function is modified to control the impact of data imbalance. We trained and tested our model on several protein-ATP binding site datasets. The comparison results show that the PFDCNN generalizes well, achieving accuracies of 0.99 and 0.984 and AUC values of 0.965 and 0.941, respectively, on two widely used protein-ATP datasets, surpassing most existing protein binding site prediction models.
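
The abstract does not say how the loss function is modified to handle data imbalance. As a minimal sketch of one common option (an assumption for illustration, not the authors' formulation), a binary cross-entropy can up-weight the rare binding residues by the negative-to-positive ratio. The function names `weighted_bce` and `balanced_pos_weight` and the toy labels below are hypothetical.

```python
import math


def balanced_pos_weight(labels):
    """Return n_negative / n_positive, a common heuristic weight for the rare class."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("at least one positive (binding) label is required")
    return n_neg / n_pos


def weighted_bce(probs, labels, pos_weight):
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight."""
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Toy per-residue data (hypothetical): 1 binding residue among 10.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A model that scores every residue low, including the true binding residue.
probs = [0.1] * 9 + [0.2]

w = balanced_pos_weight(labels)
plain = weighted_bce(probs, labels, pos_weight=1.0)
weighted = weighted_bce(probs, labels, pos_weight=w)
print(f"pos_weight={w}, plain={plain:.4f}, weighted={weighted:.4f}")
```

With the weight applied, missing the single binding residue costs far more than in the unweighted loss, so a model can no longer score well simply by predicting "non-binding" everywhere. The same imbalance is a reason to read the reported accuracies alongside the AUC values.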
