Abstract
BACKGROUND: Breast cancer heterogeneity complicates prognosis and treatment. Metabolic reprogramming, particularly lysine beta-hydroxybutyrylation (Kbhb) driven by ketone bodies, influences the tumor microenvironment. However, the impact of Kbhb on specific breast cancer subpopulations remains unclear. This study aims to identify Kbhb-affected tumor cell subsets and evaluate their prognostic potential.
METHODS: We integrated multi-omics data from TCGA, GEO, single-cell RNA sequencing, and spatial transcriptomics. After identifying breast cancer subpopulations influenced by Kbhb-associated genes, we validated the functional role of key genes in molecular experiments. We then developed a machine learning-based prognostic model using 101 algorithm combinations.
RESULTS: We identified a tumor cell subset susceptible to Kbhb-related metabolic changes that correlated significantly with patient prognosis. SCGB2A2 overexpression reduced invasion, metastasis, and stemness. A prognostic score derived from markers of Kbhb-affected cells accurately predicted patient outcomes and immunotherapy response.
CONCLUSIONS: Kbhb influences breast cancer heterogeneity, and SCGB2A2+ neoplastic cells serve as valuable prognostic indicators. Targeting these cells may improve therapeutic outcomes. Our model also supports machine learning-guided drug discovery for metabolically vulnerable subpopulations.