Abstract
BACKGROUND: Lysine crotonylation is a novel post-translational modification (PTM) associated with various diseases, but its predictive role in hepatitis B virus-related hepatocellular carcinoma (HBV-HCC) has not been fully investigated. This study characterized lysine crotonylation-related genes (LCRGs) to identify HBV-HCC molecular clusters and to develop a predictive model for HBV-HCC. METHODS: Microarray gene expression data from 170 HBV-HCC tissues and 181 non-cancerous liver tissues (from patients with HBV) were downloaded from the Gene Expression Omnibus (GEO) database (GSE55092, GSE121248, and GSE47197). We examined the expression of differentially expressed LCRGs (DE-LCRGs) and the immune characteristics of HBV-HCC samples and control (HBV-liver) samples. Based on the DE-LCRGs, we used unsupervised clustering to group the HBV-HCC samples into lysine crotonylation-related molecular clusters. Weighted gene co-expression network analysis (WGCNA) was performed to select cluster-specific differentially expressed genes (DEGs). Four machine learning (ML) models were developed, and the top-performing model was selected. The selected model was integrated into a clinical nomogram to predict patient outcomes, and its performance was evaluated by the area under the curve (AUC) in a validation set. Additionally, we examined the association of the signature genes with survival in HCC patients from The Cancer Genome Atlas (TCGA) database. RESULTS: Sixteen LCRGs were differentially expressed between the HBV-HCC and HBV-liver samples, and two distinct molecular clusters were identified. Immune cell infiltration analysis revealed significant differences in the immune microenvironment between the two clusters. The random forest (RF) model performed best, with AUC values exceeding 0.9 in both the training (AUC = 0.943) and validation (AUC = 0.901) cohorts. The predictive model incorporating five signature genes showed excellent performance on the external validation dataset. Furthermore, survival analysis revealed that these five genes were associated with poor prognosis in HCC patients. CONCLUSIONS: We identified two molecular clusters with distinct LCRG expression patterns and developed a predictive model for HBV-HCC, providing both predictive biomarkers and potential immunotherapy targets. However, more HBV-HCC cases and prospective clinical evaluations are required to validate the clinical efficacy of this model.
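As an illustration of the model-selection step described in the METHODS, the sketch below computes the AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, and then picks the candidate model with the highest AUC. It is a minimal sketch, not the study's actual pipeline: the function names `roc_auc` and `best_model`, the model labels `model_a` and `model_b`, and the toy labels and scores are all assumptions made for this example and are not taken from the study data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of predicted scores, same length as labels.
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_model(scores_by_model, labels):
    """Return (name of the model with the highest AUC, dict of all AUCs)."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
    return max(aucs, key=aucs.get), aucs


if __name__ == "__main__":
    # Toy validation set (invented for illustration only).
    labels = [1, 1, 1, 0, 0, 0]
    candidates = {
        "model_a": [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],  # hypothetical model
        "model_b": [0.6, 0.4, 0.5, 0.7, 0.3, 0.2],  # hypothetical model
    }
    name, aucs = best_model(candidates, labels)
    print(name, aucs)
```

In practice a library routine would normally replace the quadratic pairwise loop; the explicit version is used here only to make the definition of the metric visible.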