Abstract
BACKGROUND: Patients with lung adenocarcinoma (LUAC) that contains micropapillary (MP) and/or solid (S) patterns generally have a poorer prognosis. In stage IA LUAC, establishing a personalized treatment strategy for each patient is crucial for both clinical practice and research. This study aimed to develop a machine learning (ML) model that predicts the probability of MP/S patterns in patients with stage IA LUAC.
METHODS: We retrospectively analyzed 1,933 patients with stage IA LUAC confirmed by postoperative pathological staging and assessed each for the presence of MP/S patterns. MP/S-positive patients were matched with MP/S-negative patients at a 1:2 ratio. Univariate logistic regression and Lasso regression were used to select variables independently associated with MP/S patterns. A traditional logistic regression model was compared with nine ML models in terms of discrimination and calibration.
RESULTS: Nodule type, spiculation, carcinoembryonic antigen level, maximum solid component diameter, median CT value, and CT value range were identified as independent predictors of MP/S patterns. The K-Nearest Neighbors (KNN) model performed best among the ten models. Internal validation yielded an area under the curve (AUC) of 0.790, a Brier score of 0.167, and a Hosmer-Lemeshow (HL) test P value of 0.817, while external validation yielded an AUC of 0.790, a Brier score of 0.167, and an HL test P value of 0.120. Shapley additive explanation analysis showed that nodule type could alter the predicted probability of MP/S component presence by 13.6%, establishing it as a significant factor.
CONCLUSION: An interpretable KNN model was developed to predict the presence of MP/S components in patients with stage IA LUAC, and it showed the best predictive performance among the models compared. Accurate evaluation of these tumor characteristics is clinically significant because it can guide the choice of surgical approach and thereby improve patient prognosis.
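The three validation metrics reported above (AUC for discrimination, Brier score and the HL test for calibration) can be illustrated with a minimal sketch. This is not the study's code: the function names, the grouping into equal-sized risk bins, and the default of ten groups are assumptions made for illustration only.

```python
import math


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case receives a
    higher predicted probability than a random negative case (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """HL chi-square statistic and P value with (groups - 2) degrees of freedom.

    Assumption: cases are sorted by predicted probability and cut into
    `groups` bins of (nearly) equal size. The P value uses the closed-form
    chi-square survival function, which holds only for even degrees of freedom.
    """
    df = groups - 2
    if df < 2 or df % 2 != 0:
        raise ValueError("this sketch needs an even number of groups >= 4")
    # Stable sort on probability only, so tied predictions are not
    # reordered by outcome.
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / len(chunk)
        variance = len(chunk) * mean_p * (1.0 - mean_p)
        if variance > 0:                      # skip degenerate bins
            stat += (observed - expected) ** 2 / variance
    half = stat / 2.0
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k)
                                    for k in range(df // 2))
    return stat, p_value
```

A higher AUC indicates better discrimination, a lower Brier score indicates more accurate probabilities, and an HL test P value above the chosen significance level gives no evidence of miscalibration.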