Decoding potential lncRNA and disease associations through graph representation learning and gradient boosting with histogram.

Authors: Tang Lili, Liu Longlong, Jiang Yan, Yuan Yi
Long noncoding RNAs (lncRNAs) are important regulators of, and promising targets for, complex diseases, and they are densely associated with a wide range of diseases. Laboratory techniques have validated many lncRNA-disease associations (LDAs), but such experiments are costly, laborious, and time-consuming. This study introduces LDA-GMCB, an LDA inference model that combines graph embedding learning, a multi-head self-attention (MSA) mechanism with a convolutional neural network (CNN), low-rank singular value decomposition (SVD), and histogram-based gradient boosting (HGBoost). For all lncRNAs and diseases, LDA-GMCB first extracts nonlinear features using graph embedding learning and MSA with CNN, then captures linear features through low-rank SVD, and finally infers lncRNA-disease relationships with HGBoost. LDA-GMCB was compared with four baselines (SDLDA, LDNFSGB, IPCARF, and LDA-VGHB) under 5-fold cross-validation and two cold-start scenarios, and with four popular classifiers (multi-layer perceptron, SVM, random forest, and XGBoost). An ablation study was also conducted. The results showed that LDA-GMCB greatly surpassed the above models and achieved significant improvement on two public databases (lncRNADisease and MNDR) under most conditions. LDA-GMCB was further applied to infer potential lncRNAs for Alzheimer's disease and Parkinson's disease; it predicted that DGCR5 and HIF1A may be associated with these two diseases, respectively. We hope that LDA-GMCB helps infer potential lncRNAs for various complex diseases. LDA-GMCB is freely available at https://github.com/smiling199/LDA-GMCB .
