Abstract
BACKGROUND: Bladder urothelial carcinoma (BUC) remains a highly recurrent and heterogeneous malignancy. Accurate postoperative risk stratification is crucial to guide adjuvant therapy decisions. We hypothesized that integrating Uroplakin III (UPK3A) protein expression with systemic inflammation markers and demographic factors could improve prognostic prediction through advanced machine learning (ML) models.
METHODS: This retrospective study analyzed 1,032 BUC patients who underwent radical cystectomy. Clinical, pathological, and serological data, including immunohistochemical UPK3A protein expression, were collected. Least Absolute Shrinkage and Selection Operator (LASSO) regression with λ = 0.009 (determined via 10-fold cross-validation) was used for feature selection. Nine ML models were trained and validated. Model performance was assessed using the area under the receiver operating characteristic curve (AUC-ROC), calibration curves, decision curve analysis (DCA), and clinical impact curves (CIC). Model interpretability was evaluated with SHapley Additive exPlanations (SHAP).
RESULTS: The Light Gradient Boosting Machine (LightGBM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) models demonstrated superior performance (for RF, AUC 0.894 in the training set and 0.754 in the test set). SHAP analysis highlighted vascular invasion, tumor necrosis, and UPK3A protein expression as key predictors. CIC demonstrated strong clinical utility. Models integrating UPK3A protein expression with inflammatory and demographic variables outperformed traditional models.
CONCLUSIONS: The combination of UPK3A protein expression with multimodal features significantly enhances prognostic modeling in BUC. This approach offers a promising clinical decision support tool to stratify risk and guide postoperative management. Future studies should incorporate transcriptomic/proteomic data to further validate these findings.
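The two headline evaluation metrics named above, AUC-ROC and the net benefit that underlies decision curve analysis, can be illustrated with a minimal sketch. This is not the study's implementation: the function names `auc_roc` and `net_benefit` and the six-patient toy data are hypothetical, chosen only to show how the quantities are defined. AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and net benefit at a risk threshold p_t is TP/n - (FP/n) * p_t / (1 - p_t).

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit when treating everyone with risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the chosen threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = event, 0 = no event.
y = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

print(round(auc_roc(y, scores), 3))           # 8 of 9 pairs ranked correctly
print(round(net_benefit(y, scores, 0.5), 3))  # 2/6 - (1/6) * 1
```

Evaluating `net_benefit` over a grid of thresholds and plotting it against the "treat all" and "treat none" strategies would give a decision curve of the kind the abstract refers to; in practice a library routine such as scikit-learn's `roc_auc_score` would normally replace the quadratic-time AUC loop shown here.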