Abstract
BACKGROUND: Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality, and better early detection is needed. This study evaluated machine learning models that combine methylated SEPTIN9 (SEPT9) and secreted frizzled-related protein 2 (SFRP2) in circulating cell-free DNA (cfDNA) to detect HCC. METHODS: A cohort of 165 healthy volunteers, 24 patients with precancerous conditions of HCC, and 112 patients with HCC was divided into training and validation sets. Methylated SEPT9 and SFRP2 (mSEPT9/mSFRP2) were detected by real-time PCR. Using these methylation biomarkers and/or conventional biomarkers (CEA, AFP, CA125, and CA19-9), six machine learning algorithms, including Random Forest (RF), were used to build models on the training set. Models were assessed by area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity, and then tested on the validation set. RESULTS: The RF model outperformed the other models. In the training set, the RF model for the methylation-specific signature group (mSS group: mSEPT9/mSFRP2) achieved an AUC of 0.834 (95% CI: 0.745-0.923), with 69.3% sensitivity and 80.6% specificity. In the validation set, the RF model for the mSS group achieved an AUC of 0.865 (95% CI: 0.811-0.946), with 85.4% sensitivity and 71.4% specificity. CONCLUSIONS: An RF-based model integrating mSEPT9/mSFRP2 in cfDNA is a promising approach for HCC diagnosis.
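The three evaluation metrics named in METHODS (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. It is not the study's pipeline: the labels and scores are made-up toy values rather than study data, the function names `sensitivity_specificity` and `auc` are hypothetical, and the 0.5 cutoff is an assumption chosen only for illustration. The scores stand in for the predicted HCC probabilities that a classifier such as RF would output.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    Ties count as half a win. This pairwise form equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = HCC, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(labels, scores):.3f}")
```

Sensitivity and specificity depend on the chosen cutoff, whereas AUC summarizes performance across all cutoffs, which is why a model can show a higher AUC in one set while trading sensitivity against specificity differently.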