Abstract
This study aimed to develop an explainable machine learning framework that integrates dual-modality ultrasonography and thyroid function parameters for preoperative prediction of central lymph node metastasis (CLNM) in capsular-invasive papillary thyroid carcinoma. A retrospective cohort of 382 patients with pathologically confirmed capsular-invasive papillary thyroid carcinoma was divided into CLNM-negative and CLNM-positive groups. After univariate and multivariate logistic regression analyses, predictive models were developed using 8 machine learning algorithms (Logistic Regression, Support Vector Machine, Gradient Boosting Machine, eXtreme Gradient Boosting, K-Nearest Neighbors, Adaptive Boosting, Neural Network, and Categorical Boosting [CatBoost]) and evaluated by receiver operating characteristic analysis. Multivariate analysis showed that irregular margins, tumor location in the lower/mid poles, maximum diameter > 10 mm, rich blood supply, heterogeneous enhancement, and elevated thyroid-stimulating hormone were independent risk factors for CLNM. Receiver operating characteristic analysis showed that the CatBoost model achieved the best performance (training area under the curve: 0.791; test area under the curve: 0.804). SHapley Additive exPlanations analysis identified maximum diameter > 10 mm, tumor location in the lower/mid poles, and irregular margins as the top 3 contributing features. Maximum diameter > 10 mm was the most important predictor of CLNM. The CatBoost model, combined with SHapley Additive exPlanations analysis, provides a clinically applicable tool for personalized surgical planning by identifying high-risk patients who may benefit from prophylactic central lymph node dissection.
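
To illustrate the evaluation metric reported above: the area under the receiver operating characteristic curve equals the probability that a randomly chosen CLNM-positive case receives a higher predicted score than a randomly chosen CLNM-negative case, with ties counted as one half. The sketch below computes it in plain Python. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for CLNM-positive, 0 for CLNM-negative (hypothetical encoding).
    scores: predicted probability or risk score for each case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based implementation would be preferable for larger datasets.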