Abstract
Tian et al present a timely machine learning (ML) model that integrates biochemical indicators with novel traditional Chinese medicine (TCM) indicators (tongue edge redness, greasy coating) to predict hepatic steatosis in patients at high metabolic risk. Their prospective cohort design and dual feature selection (LASSO + RFE), culminating in an interpretable XGBoost model (area under the curve: 0.82), represent a significant methodological advance. The inclusion of TCM diagnostics addresses the multisystem heterogeneity of metabolic dysfunction-associated fatty liver disease (MAFLD), a key strength that bridges holistic medicine with precision analytics and underscores potential cost savings over imaging-dependent screening. However, critical limitations impede clinical translation. First, the model was validated in a single center (n = 711) and has not been tested externally for generalizability across diverse populations, which risks bias from local demographics. Second, MAFLD subtyping (e.g., lean MAFLD, diabetic MAFLD) was omitted despite acknowledged disease heterogeneity; this overlooks distinct pathophysiologies and may limit utility in stratified care. Third, although TCM features ranked among the top predictors in the SHAP analysis, their clinical interpretability remains unclear without mechanistic links to metabolic dysregulation. To resolve these gaps, we propose three steps: (1) external validation in multiethnic cohorts using the published feature set (e.g., aspartate aminotransferase/alanine aminotransferase, low-density lipoprotein cholesterol, TCM tongue markers) to assess robustness; (2) subtype-specific modeling to capture MAFLD heterogeneity, potentially enhancing accuracy in high-risk subgroups; and (3) investigation of correlations between TCM features and microbiome/metabolomic profiles to ground tongue phenotypes in biological pathways and strengthen model credibility. Despite these shortcomings, this work pioneers a low-cost screening paradigm. Future iterations that address these issues could revolutionize early MAFLD detection in resource-limited settings.