Abstract
Research has consistently indicated that long non-coding RNAs (lncRNAs) significantly influence the development of numerous diseases, so predicting lncRNA-disease associations (LDAs) can contribute to disease prevention and treatment. However, most existing computational models face several challenges: (i) difficulty in capturing complex higher-order relationships among nodes; (ii) a limited number of known associations and neglect of the consistency of representations across views; and (iii) inadequate fusion of multi-view data. In this work, we introduce an end-to-end method named HGCMLDA for LDA prediction. First, HGCMLDA constructs hypergraphs of lncRNAs and diseases from integrated similarity matrices using Gaussian mixture model and k-nearest neighbor methods, and applies a hypergraph convolutional network to extract high-order representations of lncRNAs and diseases. Contrastive learning then captures the information interaction between different views, which alleviates the dependence on limited known associations and enhances the node representations in an unsupervised way. Next, HGCMLDA applies multi-scale attentional feature fusion, which considers the importance weights of different views and aggregates both global and local context to achieve effective and adequate feature fusion. In addition, disease and lncRNA features are extracted by applying a variational autoencoder to the association matrix, so that prior knowledge is effectively incorporated into prediction. Finally, the features from these two parts are concatenated, and matrix completion is performed to predict LDA scores. Comparison experiments indicate that HGCMLDA outperforms five state-of-the-art models for LDA prediction, and case studies for specific diseases demonstrate that HGCMLDA can identify novel associations with high accuracy.
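To make the hypergraph step concrete, the following is a minimal NumPy sketch of a k-nearest-neighbor hypergraph built from a similarity matrix, followed by one hypergraph convolution layer. It is an illustration under stated assumptions, not the HGCMLDA implementation: the function names `knn_hypergraph` and `hypergraph_conv` are hypothetical, the Gaussian mixture model hyperedges are omitted, all hyperedge weights are set to one, and the propagation rule is the widely used HGNN form, which may differ from the layer used in the paper. The similarity matrix here is random toy data.

```python
import numpy as np

def knn_hypergraph(sim, k):
    """Build an incidence matrix H (nodes x hyperedges) from a similarity matrix.

    Hyperedge j contains node j plus its k most similar other nodes.
    """
    n = sim.shape[0]
    H = np.zeros((n, n))
    for j in range(n):
        scores = sim[j].astype(float).copy()
        scores[j] = -np.inf                    # exclude the centre node from its own search
        neighbours = np.argsort(-scores)[:k]   # indices of the k largest similarities
        H[j, j] = 1.0
        H[neighbours, j] = 1.0
    return H

def hypergraph_conv(X, H, theta):
    """One HGNN-style layer: ReLU(Dv^-1/2 H De^-1 H^T Dv^-1/2 X theta).

    Assumes unit hyperedge weights (an assumption of this sketch).
    """
    dv = H.sum(axis=1)                         # node degrees (>= 1 by construction)
    de = H.sum(axis=0)                         # hyperedge degrees (k + 1 each)
    dv_inv_sqrt = np.diag(dv ** -0.5)
    de_inv = np.diag(1.0 / de)
    A = dv_inv_sqrt @ H @ de_inv @ H.T @ dv_inv_sqrt
    return np.maximum(A @ X @ theta, 0.0)

# Toy usage: 6 nodes with 4 input features, projected to 3 output features.
rng = np.random.default_rng(0)
feats = rng.random((6, 4))
sim = feats @ feats.T                          # stand-in for an integrated similarity matrix
H = knn_hypergraph(sim, k=2)
theta = rng.standard_normal((4, 3))            # stand-in for learned layer weights
Z = hypergraph_conv(feats, H, theta)
```

Because each hyperedge groups a node with several neighbors at once, a single layer mixes information across whole neighborhoods rather than across pairwise edges only, which is the sense in which the representations are higher-order.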