Abstract
Background: Microvascular invasion (MVI) is a critical prognostic factor in hepatocellular carcinoma (HCC), but preoperative three-class prediction of MVI status remains challenging. Radiomics and clinical biomarkers may enable more accurate and individualized assessment. Aim: To develop and validate a Transformer-based deep learning framework that integrates radiomic and clinical features for direct three-class MVI classification in patients with HCC. Methods: This retrospective study included 437 patients with pathologically confirmed HCC and known MVI status from two campuses of a single institution (referred to here as Hospital A and Hospital B). Patients from Hospital A (n = 305) were randomly divided into training and internal test cohorts, while patients from Hospital B (n = 132) served as an independent external validation cohort. Radiomic features were extracted from preoperative Gd-BOPTA-enhanced MRI, and clinical laboratory data were collected. A two-stage feature selection strategy was applied, combining univariate statistical testing with recursive feature elimination. A Transformer-based model was built to classify three MVI categories (M0, M1, M2), and its performance was evaluated in both the internal test cohort and the external validation cohort. Results were compared with those of traditional machine learning models, including Random Forest, Logistic Regression, XGBoost, and LightGBM. Results: On the internal test set (n = 76, Hospital A), the model achieved an accuracy of 0.733 (95% CI: 0.64-0.83), a weighted F1-score of 0.733, and a macro-average AUC of 0.880 (95% CI: 0.807-0.953). For M1, the sensitivity was 0.56 (95% CI: 0.31-0.78) and the specificity was 0.86 (95% CI: 0.74-0.94); for high-risk M2 cases, the sensitivity was 0.73 (95% CI: 0.64-0.81) and the specificity was 0.91 (95% CI: 0.85-0.96).
On the external validation set (n = 132, Hospital B), performance remained stable, with an accuracy of 0.758, a weighted F1-score of 0.768, and a macro-average AUC of 0.886 (95% CI: 0.833-0.940). Conclusions: The proposed Transformer-based model provides accurate and objective three-class MVI prediction from multi-modal features, potentially supporting individualized surgical planning and improved clinical outcomes. In particular, preoperative identification of high-risk M2 patients may inform surgical margin design, guide adjuvant therapy strategies, and influence liver transplantation eligibility.
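
As an illustration of how per-class sensitivity and specificity can be obtained in a three-class setting such as M0/M1/M2, the following minimal sketch treats each category one-vs-rest and reads the counts off a confusion matrix. The function name `one_vs_rest_metrics` and the counts in `toy_cm` are hypothetical and are not study data; the study's own implementation and confidence-interval method are not described here, so this should be read only as one plausible way to compute such metrics.

```python
from typing import Dict, List, Tuple


def one_vs_rest_metrics(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {label: (sensitivity, specificity)} from a square confusion matrix.

    cm[i][j] is the number of cases whose true class is i and predicted class is j.
    Each class is evaluated against all other classes pooled together.
    """
    total = sum(sum(row) for row in cm)
    results: Dict[str, Tuple[float, float]] = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                          # class k correctly predicted
        fn = sum(cm[k]) - tp                   # class k predicted as another class
        fp = sum(row[k] for row in cm) - tp    # other classes predicted as k
        tn = total - tp - fn - fp              # everything else
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[label] = (sensitivity, specificity)
    return results


# Hypothetical counts for illustration only (rows: true class, columns: predicted).
toy_cm = [
    [8, 1, 1],  # true M0
    [2, 5, 3],  # true M1
    [0, 2, 8],  # true M2
]
metrics = one_vs_rest_metrics(toy_cm, ["M0", "M1", "M2"])
for label, (sens, spec) in metrics.items():
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Pooling the remaining classes keeps the definition of specificity unambiguous when there are more than two categories: for M2, for example, every M0 or M1 case not predicted as M2 counts as a true negative.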