Abstract
INTRODUCTION: Metabolic dysfunction-associated steatotic liver disease (MASLD) can progress to metabolic dysfunction-associated steatohepatitis (MASH) and liver fibrosis, contributing to a growing global health burden. Non-invasive diagnostic tools built on machine learning (ML) and deep learning (DL), two representative families of artificial intelligence (AI) algorithms, are increasingly being explored for the assessment of MASH and its related fibrosis.
OBJECTIVES: This study aimed to compare the diagnostic performance of different ML and DL models and to identify the top-performing models for diagnosing MASH and associated liver fibrosis.
METHODS: A systematic review and meta-analysis was conducted, searching PubMed, Web of Science, Embase, and the Cochrane Library from inception to May 18, 2025. Pooled area under the receiver operating characteristic curve (AUROC) values with 95% confidence intervals (CIs) were calculated. Accuracy, specificity, sensitivity, positive predictive values, and negative predictive values were also recorded.
RESULTS: Of 4,314 studies initially identified, 106 met the inclusion criteria, with 35 studies (ML: n = 28; DL: n = 7) providing data for analysis. Logistic Regression and Neural Network were the most commonly applied algorithms in ML and DL, respectively. The pooled AUROCs for diagnosing MASH were 0.833 (95% CI: 0.806-0.860) for ML models and 0.841 (95% CI: 0.782-0.900) for DL models. Light Gradient Boosting Machine (LightGBM) and ResNet50 were the best-performing models for diagnosing MASH among ML and DL algorithms, respectively, with AUROCs of 0.920 (95% CI: 0.916-0.924) and 0.960 (95% CI: 0.951-0.969). For fibrosis diagnosis, ML models had a pooled AUROC of 0.826 (95% CI: 0.792-0.860), with Categorical Boosting (CatBoost) achieving the highest AUROC of 0.960 (95% CI: 0.950-0.970). DL models yielded a pooled AUROC of 0.875 (95% CI: 0.816-0.934) for fibrosis diagnosis.
CONCLUSIONS: Both ML and DL models demonstrated strong diagnostic performance for MASH and liver fibrosis, with DL achieving marginally higher AUROCs. AI-driven approaches show promise in MASLD management.
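As an illustration of how pooled AUROCs with 95% CIs of the kind reported above can be computed, the following is a minimal Python sketch. It assumes inverse-variance weighting with a DerSimonian-Laird random-effects model and standard errors recovered from symmetric 95% CIs; the abstract does not state which pooling model the authors used, so this is not necessarily their method. The function names `se_from_ci` and `pool_random_effects` are hypothetical, and the study-level values in the usage section are placeholders, not data from the included studies.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Recover a standard error from a symmetric 95% CI (assumed normal)."""
    return (upper - lower) / (2.0 * z)


def pool_random_effects(estimates, std_errors, z=Z_95):
    """DerSimonian-Laird random-effects pooling (an assumed model choice).

    Returns (pooled estimate, CI lower, CI upper, tau^2).
    """
    k = len(estimates)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se_pooled = math.sqrt(1.0 / sum_w_star)
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


if __name__ == "__main__":
    # Standard error implied by one interval quoted in the abstract.
    print(se_from_ci(0.916, 0.924))

    # Placeholder study-level AUROCs and CIs (hypothetical, illustration only).
    aurocs = [0.80, 0.85, 0.88]
    cis = [(0.74, 0.86), (0.81, 0.89), (0.83, 0.93)]
    ses = [se_from_ci(lo, hi) for lo, hi in cis]
    print(pool_random_effects(aurocs, ses))
```

Because AUROC is bounded between 0 and 1, some meta-analyses pool on a transformed scale (for example, logit) and back-transform the result; the sketch pools on the raw scale for simplicity.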