Abstract
Lung adenocarcinoma (LUAD) displays substantial morphological and molecular heterogeneity both within and between tumors. This heterogeneity, together with the complexity of the tumor microenvironment (TME), strongly influences LUAD progression and patient prognosis. Integrating mass-spectrometry-based proteomic data from human tumors with matched multi-omics data enables comprehensive and systematic cancer proteogenomic analyses. In this study, we analyzed LUAD proteogenomic data from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and performed a systematic molecular classification by integrating genomic, transcriptomic, and proteomic data using ten clustering algorithms. This approach identified three distinct molecular subtypes with notable biological heterogeneity: metabolic pathway activation, cell cycle pathway activation, and immune modulation. The metabolic pathway activation subtype was characterized by significant upregulation of metabolic pathways, with enhanced mRNA expression and protein abundance of genes related to fatty acid and bile acid metabolism (e.g., MLYCD, ACSM3, ACSL5, ACSS1, and MAOA). It also exhibited a notably higher frequency of EGFR mutations than the other subtypes. The cell cycle pathway activation subtype was defined by activation of cell cycle signaling pathways, with amplification of the PCNA gene and significantly increased mRNA and protein expression of key cell cycle-related genes such as CCNB1 and CDK1. Importantly, the tumor suppressor gene STK11 showed a high mutation frequency, copy number deletions, and downregulation in this subtype, leading to inactivation of its tumor suppressor function. This subtype also showed potential sensitivity to various cell cycle inhibitors. The immune modulation subtype was characterized by high immune cell infiltration and low tumor purity. This subtype exhibited multi-omics activation of the Ras signaling pathway, including MET amplification, increased GRB2 expression, decreased PEBP1 expression, and RET activating mutations. These findings suggest that this subtype may benefit from a combination of immunotherapy and targeted therapy against the Ras signaling pathway. Based on the expression profiles of 300 subtype-specific marker genes, we validated the molecular characteristics and clinical relevance of the three subtypes in the TCGA-LUAD and GSE50081 cohorts. Moreover, to enhance clinical applicability, we developed a rapid and cost-effective multi-instance deep learning model based on pathological images. This model demonstrated excellent performance in predicting the multi-omics subtypes. In conclusion, this study systematically elucidates the molecular heterogeneity of LUAD through a multi-omics integration strategy, establishes a clinically relevant molecular classification system, and provides a theoretical foundation for developing personalized treatment plans for LUAD.