Abstract
BACKGROUND: Individualized survival prediction tools are lacking for stage IB lung squamous-cell carcinoma (LSCC), and the benefit of adjuvant chemotherapy remains debated. This study aimed to develop an interpretable machine-learning model for survival prediction in stage IB LSCC and to re-evaluate the added value of postoperative chemotherapy. METHODS: A total of 6445 patients with stage IB LSCC diagnosed between 2000 and 2015 were extracted from the SEER database. Patients diagnosed from 2000 to 2014 (n = 5740) were split 7:3 into training and internal validation cohorts, and those diagnosed in 2015 (n = 705) served as the external validation cohort. Six machine-learning algorithms (including logistic regression and gradient-boosting models) were trained to predict 1-, 3-, and 5-year overall survival (OS), with hyperparameters optimized by 10-fold cross-validation. SHAP analysis was used to interpret the model, and 1:1 nearest-neighbor propensity-score matching was used to evaluate the benefit of chemotherapy in completely resected patients. RESULTS: The LightGBM model achieved the best discriminative performance (AUC = 0.834, 0.828, and 0.800 for 1-, 3-, and 5-year OS, respectively) and generalized well in external validation. SHAP analysis identified treatment modality as the top survival predictor; both surgery alone and surgery plus chemotherapy were associated with improved survival, but no significant OS difference was observed between the two strategies at any timepoint, a finding consistent across subgroups (including age and tumor size). Propensity-score matching of 565 patients confirmed similar outcomes (median OS: 64 vs. 62 months; HR = 1.01, 95% CI: 0.82-1.20; p = 0.893). CONCLUSION: The model provides individualized survival estimates for stage IB LSCC and supports a risk-adapted, conservative approach to adjuvant treatment. High-risk subgroups derived no additional benefit from postoperative chemotherapy, a finding that may help integrate precision medicine and shared decision-making into the management of early-stage LSCC.
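The 1:1 nearest-neighbor propensity-score matching named in METHODS can be illustrated with a minimal sketch. The abstract does not state the matching covariates, the caliper, whether matching was done with replacement, or the software used, so everything below is an assumption: the function name `match_nearest_neighbor`, the greedy without-replacement strategy, the optional caliper, and the toy scores are all hypothetical. The sketch also assumes propensity scores have already been estimated (for example, by a logistic regression of treatment on baseline covariates).

```python
def match_nearest_neighbor(treated, control, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    caliper: optional maximum allowed score difference for a valid match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls that have not been matched yet
    pairs = []
    # Visit treated patients in ascending score order for reproducibility.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable control; leave this patient unmatched
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement: each control is used once
    return pairs


# Toy, made-up propensity scores (not data from the study).
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.90}
control = {"C1": 0.28, "C2": 0.35, "C3": 0.60, "C4": 0.10}

pairs = match_nearest_neighbor(treated, control, caliper=0.05)
print(pairs)
```

In this toy example the treated patient with score 0.90 stays unmatched because no control lies within the assumed caliper of 0.05; outcomes such as OS would then be compared only within the matched pairs.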