Abstract
Identifying new lncRNA-disease associations (LDAs) is essential for deciphering pathological mechanisms and for finding new diagnostic and therapeutic clues for complex diseases. However, experimental identification of LDAs is time-consuming and costly. Here, we introduce a novel deep learning-based method, LDA-CAMF, to infer LDA candidates. LDA-CAMF first designs a cross-attention mechanism to dynamically decode high-order interdependencies between lncRNAs and diseases, then applies a multi-level feature fusion strategy to aggregate the hierarchical node representations learned at different layers, next fuses the original features with the optimized representations to enhance the model's expressive ability, and finally predicts novel LDAs using XGBoost. In comparison with six state-of-the-art methods (SDLDA, LDNFSGB, IPCARF, LDASR, LDA-VGHB, and GEnDDn), LDA-CAMF achieved the highest AUCs (0.9632 and 0.9759) and the best AUPRs (0.9369 and 0.9783) on the lncRNADisease and MNDR datasets, respectively, under 5-fold cross-validation. Under "cold-start" scenarios for lncRNAs and diseases, LDA-CAMF outperformed the six baselines under most conditions. Five ablation studies further validated the superior predictive performance of LDA-CAMF. Visualization of lncRNA-disease pair (LDP) feature distributions also demonstrated the effectiveness of the proposed LDP feature learning strategy. Case studies indicated that HNF1A-AS1 and BCYRN1 are associated with prostate cancer and that HAR1A is linked with diabetes. We anticipate that LDA-CAMF will assist in biomarker identification and mechanism investigation for complex diseases.
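To make the feature-learning steps concrete, the following is a minimal NumPy sketch of cross-attention between lncRNA and disease features followed by multi-level fusion with the original features. It is an illustration under stated assumptions, not the LDA-CAMF implementation: the function names (`cross_attention`, `fuse_levels`, `pair_vector`), the absence of learned projection weights, the shared feature dimension, and the two-layer depth are all hypothetical choices made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    # Scaled dot-product cross-attention: every query row attends over
    # all rows of keys_values. Assumption: identity projections, i.e.
    # no learned query/key/value weight matrices.
    d = queries.shape[1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=1)
    return weights @ keys_values, weights

def fuse_levels(lnc, dis, n_layers=2):
    # Apply n_layers of mutual cross-attention, keeping the output of
    # every layer, then concatenate the original features with all
    # layer outputs (a simple form of multi-level feature fusion).
    lnc_levels, dis_levels = [lnc], [dis]
    for _ in range(n_layers):
        l_prev, d_prev = lnc_levels[-1], dis_levels[-1]
        l_new, _ = cross_attention(l_prev, d_prev)  # lncRNAs attend to diseases
        d_new, _ = cross_attention(d_prev, l_prev)  # diseases attend to lncRNAs
        lnc_levels.append(l_new)
        dis_levels.append(d_new)
    return np.concatenate(lnc_levels, axis=1), np.concatenate(dis_levels, axis=1)

def pair_vector(lnc_fused, dis_fused, i, j):
    # Feature vector of the lncRNA-disease pair (i, j).
    return np.concatenate([lnc_fused[i], dis_fused[j]])
```

In a full pipeline, the resulting pair vectors and their known association labels would be passed to a gradient-boosted tree classifier such as XGBoost to score candidate associations; that step is omitted here.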