Abstract
Hepatocellular carcinoma (HCC) is a major cause of cancer-related mortality worldwide, underscoring the need for improved non-invasive diagnostic strategies. In this study, we developed an interpretable machine learning model using extracellular vesicle (EV)-derived RNA signatures from the exoRBase 3.0 database. Six machine learning algorithms were evaluated; among them, a Deep Neural Network achieved the numerically highest discriminative performance on an internal hold-out test set under the experimental setup used in this study (AUC = 0.8877). A panel of ten diagnostic mRNAs (MTRNR2L8, HBB, PF4, FTL, MTRNR2L12, TMSB4X, PPBP, OST4, ACTB, and S100A9) was identified, with MTRNR2L8 showing the strongest contribution to model predictions. SHapley Additive exPlanations (SHAP) and Kolmogorov–Arnold Networks (KAN) were applied to enhance interpretability and to characterize nonlinear relationships between EV-derived gene expression features and classification outcomes. An online prediction interface was implemented as a demonstration tool to illustrate potential applicability. Overall, this study presents an exploratory, proof-of-concept framework for EV-based HCC classification. Further validation in independent and prospective cohorts will be required before clinical application can be considered.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-40020-9.