A Deep Learning Model to Predict the ncRNA-Protein Interactions Based on Sequences Information Only



Abstract

Noncoding RNAs (ncRNAs) play significant roles in multiple fundamental biological processes. In particular, ncRNA interactions provide valuable insights into protein synthesis, control of gene expression, RNA processing, and regulation of localization. Dysregulation of ncRNA interactions may cause severe diseases, including cancer. Developing computational methods for investigating ncRNA-protein interactions has therefore become a problem of interest for researchers. In this study, we propose a novel deep learning (DL) model named RPI-SDA-XGBoost for predicting interactions between ncRNAs and proteins. We used the 3-mer conjoint triad feature (CTF) to encode the protein sequence and the 4-mer frequency to encode the RNA sequence, which together yield a 599-dimensional feature vector. The DL approach is built on stacked denoising autoencoders (SDA) that discover high-level hidden characteristics in 2 separate networks, one representing proteins and one representing ncRNAs. The combined features were fed into an XGBoost-based meta-learner for the final prediction. The proposed model, RPI-SDA-XGBoost, outperformed most of the individual baseline models and significantly improved performance on multiple benchmark data sets. We validated the generalization power of the proposed model on five benchmark data sets, namely RPI_369, RPI_488, RPI_1807, RPI_2241, and RPI_NPInter v2.0. RPI-SDA-XGBoost achieved accuracy comparable to the state of the art on data sets RPI_488, RPI_1807, and RPI_NPInter v2.0. The proposed model achieved the best precision, 87.9% and 94.6%, on the two largest data sets, RPI_2241 and RPI_NPInter v2.0, respectively. We believe the proposed model provides a useful direction for future biological research and suggests that more sophisticated computational approaches to ncRNA-protein interaction prediction are warranted in the near future.
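
The sequence encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 7-class amino acid grouping below is the one commonly used for conjoint triads and is an assumption here, as are the function names, the frequency normalization, and the handling of unknown residues. With 7 classes, 3-mers over the reduced protein alphabet give 7^3 = 343 features, and 4-mers over the RNA alphabet give 4^4 = 256 features, which is consistent with the 599 dimensions reported.

```python
from itertools import product

# Assumption: the 7-class amino acid grouping commonly used for conjoint triads.
# The abstract does not list the grouping, so treat this as illustrative.
CTF_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: str(i) for i, group in enumerate(CTF_GROUPS) for aa in group}


def kmer_frequencies(symbols, alphabet, k):
    """Return normalized frequencies of all len(alphabet)**k possible k-mers."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0.0] * len(index)
    total = 0
    for start in range(len(symbols) - k + 1):
        position = index.get(symbols[start:start + k])
        if position is not None:  # skip k-mers containing unknown symbols
            counts[position] += 1.0
            total += 1
    if total:
        counts = [c / total for c in counts]
    return counts


def encode_protein(seq):
    """3-mer conjoint triad features over the reduced 7-letter alphabet (343 dims)."""
    reduced = "".join(AA_TO_GROUP.get(aa, "X") for aa in seq.upper())
    return kmer_frequencies(reduced, "0123456", 3)


def encode_rna(seq):
    """4-mer frequency features over A, C, G, U (256 dims)."""
    return kmer_frequencies(seq.upper().replace("T", "U"), "ACGU", 4)


def encode_pair(protein_seq, rna_seq):
    """Return the two feature vectors separately, one per input network."""
    return encode_protein(protein_seq), encode_rna(rna_seq)
```

The two vectors are returned separately because the abstract describes 2 separate networks for proteins and ncRNAs; how the learned representations are then combined before the XGBoost meta-learner is not specified in the abstract and is not sketched here.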
