Abstract
BACKGROUND: Hepatocellular carcinoma (HCC) is a malignancy characterized by unchecked cellular growth in the liver, often leading to systemic inflammation and organ failure. Its complex molecular mechanisms are not fully understood. This study aimed to identify key molecular targets and pathways that could support more timely and effective diagnosis and treatment of HCC. METHODS: Microarray datasets from the NCBI Gene Expression Omnibus were analyzed to identify differentially expressed genes (DEGs) in HCC patients compared with controls. DEGs shared across datasets were subjected to functional enrichment analyses. Weighted gene coexpression network analysis (WGCNA) and single-cell sequencing were used to identify gene modules. Immune cell infiltration was assessed by single-sample gene set enrichment analysis (ssGSEA). A diagnostic model was then constructed using multiple machine learning algorithms, validated by 10-fold cross-validation, and tested on external datasets. RESULTS: Eight key genes significantly associated with HCC, primarily involved in immune and inflammatory responses, were identified. Enrichment analysis highlighted their roles in critical biological processes and pathways. Immune infiltration analysis revealed immune profiles in HCC patients that were distinct from those of healthy controls. A novel 8-gene diagnostic signature (ECM1, HAMP, MT1H, MT1F, CYP1A2, ASPM, CXCL14, and FCN3) outperformed existing models, achieving an area under the curve (AUC) of 1.000 in the training cohorts and validating robustly in external datasets. CONCLUSION: Integrating machine learning with genomic data enabled the development of a robust diagnostic model for HCC that emphasizes genes involved in immune responses. The identified genes and the new diagnostic signature offer insight into the pathophysiology of HCC and hold potential for improving diagnostic strategies and patient management.
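The validation step described in METHODS (10-fold cross-validation scored by AUC) can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract does not name the machine learning algorithms used, so a nearest-centroid classifier stands in for them, the expression values are synthetic, and every function name (`auc`, `stratified_folds`, `cross_validated_auc`, `synthetic_cohort`) is hypothetical. Only the eight gene symbols come from the abstract.

```python
import random

# Gene symbols from the abstract; everything else below is illustrative.
GENES = ["ECM1", "HAMP", "MT1H", "MT1F", "CYP1A2", "ASPM", "CXCL14", "FCN3"]

def auc(labels, scores):
    """Rank-based AUC: chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds while keeping the class ratio in each fold."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def centroid(rows):
    """Per-gene mean over a list of expression vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cross_validated_auc(X, y, k=10, seed=0):
    """Score every sample with a model fitted on the other k-1 folds, then pool the scores."""
    folds = stratified_folds(y, k, random.Random(seed))
    scores = [0.0] * len(y)
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        c_case = centroid([X[i] for i in train if y[i] == 1])
        c_ctrl = centroid([X[i] for i in train if y[i] == 0])
        for i in test_idx:
            # Higher score means closer to the case centroid than to the control centroid.
            scores[i] = sq_dist(X[i], c_ctrl) - sq_dist(X[i], c_case)
    return auc(y, scores)

def synthetic_cohort(n_per_class=50, shift=1.0, seed=1):
    """Synthetic stand-in for expression data: cases are shifted by an arbitrary amount per gene."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label, 1.0) for _ in GENES])
            y.append(label)
    return X, y

X, y = synthetic_cohort()
print(f"10-fold cross-validated AUC: {cross_validated_auc(X, y):.3f}")
```

Because each sample is scored only by a model that never saw it, the pooled AUC estimates performance on unseen data, which is why a cross-validated or external-dataset AUC is reported alongside the training-cohort value.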