Abstract
Integrating high-dimensional multi-omics data is essential for uncovering the coordinated molecular mechanisms underlying cancer progression and for improving survival prediction. DNA methylation and mRNA expression represent two tightly coupled regulatory layers; however, many existing approaches either model them independently or rely on linear assumptions that fail to capture nonlinear cross-omics structure. Here, we propose MT-ADSCCA, a multitask adaptive deep sparse canonical correlation analysis framework that jointly learns correlated latent representations, selects interpretable multi-omics biomarkers, and supports downstream survival modeling. MT-ADSCCA embeds sparse CCA into a nonlinear encoder architecture and uses uncertainty-guided adaptive weighting to stabilize multi-objective training. The selected features are then modeled with a BiLSTM-Cox survival network, with genes ordered by chromosomal coordinates to capture local genomic dependencies. We evaluated MT-ADSCCA using event-stratified nested 10-fold cross-validation on three TCGA cohorts: breast invasive carcinoma (BRCA), glioma (GBMLGG), and pan-kidney carcinoma (KIPAN), comprising 485, 563, and 652 matched multi-omics samples, respectively. MT-ADSCCA achieved the highest concordance index in all three cohorts, outperforming six feature-selection baselines (DA, WGCNA, lmQCM, CCA, OSCCA, DeepCorrSurv) and four survival-model baselines (LASSO-Cox, RSF, MTLSA, DeepSurv). Kaplan-Meier analyses further showed clear separation between the predicted high- and low-risk groups. The selected canonical features were enriched in biologically coherent functional categories, supporting the interpretability of the learned patterns. Together, these results demonstrate that MT-ADSCCA provides a robust and interpretable framework for multi-omics integration and cancer prognosis prediction.
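For readers less familiar with the evaluation metric, the concordance index reported above can be illustrated with a minimal sketch of Harrell's C-index for right-censored data. This is an illustration only, not the implementation used in this work: the function name `concordance_index`, its arguments, and the tie-handling conventions (pairs with tied times are skipped, tied risk scores count as half-concordant) are assumptions made for the example.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the sample was censored
    risks  : predicted risk scores (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # earlier time corresponds to an observed event (assumption:
        # pairs with tied times are simply skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # earlier event has higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under these conventions, a model whose risk scores rank every comparable pair correctly scores 1.0, a model that assigns the same score to every sample scores 0.5, and a perfectly inverted ranking scores 0.0.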