Abstract
INTRODUCTION: This study aimed to train machine learning algorithms (MLAs) to detect advanced fibrosis (AF) in patients with metabolic dysfunction-associated steatotic liver disease (MASLD) in the primary care setting and to explain their predictions to ensure responsible use by clinicians.
METHODS: Readily available features of 618 MASLD patients followed up at a tertiary center were used to train five MLAs. AF was defined as liver stiffness ≥9.3 kPa measured via 2-dimensional shear wave elastography (n = 495) or as fibrosis stage ≥F3 on liver biopsy (n = 123). The MLAs were compared with the Fibrosis-4 index (FIB-4) and the non-alcoholic fatty liver disease (NAFLD) fibrosis score (NFS) in a validation cohort of 540 MASLD patients from the primary care setting. Feature importance, partial dependence, and Shapley additive explanations (SHAP) were used for explanation.
RESULTS: Extreme gradient boosting (XGBoost) achieved an AUC of 0.91, outperforming FIB-4 (AUC = 0.78) and NFS (AUC = 0.81; both p < 0.05), with a specificity of 76% versus 59% for FIB-4 ≥1.3 and 48% for NFS ≥-1.45 (p < 0.05). Its sensitivity (91%) was superior to that of FIB-4 (79%). XGBoost confidently excluded AF (negative predictive value = 99%) and had the highest positive predictive value (31%), superior to FIB-4 and NFS (all p < 0.05). The most important features were HbA1c and gamma-glutamyl transpeptidase (GGT), with a steep increase in AF probability at HbA1c >6.5%. The strongest interaction was between AST and age. XGBoost, but not logistic regression, extracted informative patterns from ALT, low-density lipoprotein cholesterol, and alkaline phosphatase (p < 0.001). Based on the SHAP values of GGT, platelets, and ALT, which were found to be associated with false-positive (FP) classification, one-quarter of the FPs were correctly reclassified at the cost of only one additional false negative.
CONCLUSIONS: An explainable XGBoost algorithm was superior to FIB-4 and NFS for screening for AF in MASLD patients in the primary care setting. The algorithm also proved safe for use, as clinicians can understand its predictions and flag FP classifications.
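For readers unfamiliar with the FIB-4 comparator, the following is a minimal sketch of how the index and the ≥1.3 cutoff quoted above could be computed. It assumes the standard published FIB-4 formula (age × AST / (platelets × √ALT)), which the abstract does not restate. The function names `fib4` and `fib4_flags_af` are hypothetical, the input values in the usage lines are illustrative rather than study data, and nothing here reproduces the authors' XGBoost model.

```python
import math

# Cutoff quoted in the abstract for the FIB-4 comparator.
FIB4_CUTOFF = 1.3


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """Return the FIB-4 index (assumed standard formula).

    age in years, AST and ALT in U/L, platelet count in 10^9/L.
    """
    if alt_u_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def fib4_flags_af(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """Return True when FIB-4 meets or exceeds the 1.3 cutoff."""
    score = fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l)
    return score >= FIB4_CUTOFF


# Illustrative (hypothetical) patients, not data from the study.
older_patient = fib4(60, 40, 36, 200)    # 60*40 / (200*6)
younger_patient = fib4(40, 20, 25, 250)  # 40*20 / (250*5)
```

Because age enters the numerator directly, older patients cross a fixed cutoff more easily, which is one commonly cited reason for the modest specificity of FIB-4 at the 1.3 threshold.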