Abstract
We introduce consensus MSClustering, an unsupervised hierarchical network approach that integrates multi-omics data to identify molecular subtypes and conserved pathways across diverse cancers. Using a novel heterogeneity index, we selected 167 key genes with functionally coherent roles validated through Gene Ontology analysis. Applied to 2439 tumors spanning 10 cancer types, and successfully extended to 2675 tumors (12 types) including cases with incomplete molecular data, MSClustering demonstrated: (i) precise classification of major cancer types and breast cancer molecular subtypes; (ii) discovery of novel pan-cancer squamous metaplastic signatures; (iii) exceptional prognostic stratification (log-rank P = 2.3 × 10⁻⁴⁶); and (iv) superior performance over existing methods (COCA/SNF) in classification accuracy, cluster robustness, and computational efficiency. The method's multi-scale architecture uniquely resolves breast cancer heterogeneity across biological resolution levels. Pathway analysis further revealed four key oncogenic programs (proteoglycan signaling, chromosomal stability, VEGF-mediated angiogenesis, and drug metabolism), along with disruptions in immune and digestive system functions. This integrative framework marks a significant advancement in cancer genomics by enabling more refined molecular classification, enhanced prognostic insights, and deeper understanding of disease mechanisms. These results highlight the potential of MSClustering to inform the development of clinically relevant biomarkers and support more personalized strategies in precision oncology.