Abstract
PURPOSE: To avoid over-treatment of prostate cancer patients following screening for elevated prostate-specific antigen levels, active surveillance has been suggested as an alternative to radical treatment. Under active surveillance, patients with low-grade cancer return for recurring visits so that progression can be monitored. Our aim was to develop an artificial intelligence-based model that can identify high-risk patients in a cohort of prostate cancer patients on active surveillance.

APPROACH: We developed a multiple instance learning-based framework for predicting longitudinal outcomes of prostate cancer patients on active surveillance. Our models were trained only on whole-slide images with patient-level labels, without using explicit Gleason grades. We employed the UNI-2 foundation model and the well-established attention-based multiple instance learning approach. We further evaluated our models by fitting Cox proportional hazards models to their predictions and by testing them on an external dataset.

RESULTS: With this approach, we achieved an average area under the receiver operating characteristic curve of 0.958 (95% CI, 0.957 to 0.959). Fitting Cox models to the predicted probabilities achieved a C-index of 0.824 and a hazard ratio of 2.32. However, all models showed a large drop in performance when evaluated on the external dataset.

CONCLUSION: We show that avoiding Gleason grades is beneficial for longitudinal outcome prediction of prostate cancer. Our results suggest that benign prostate tissue contains prognostic information. However, substantial work remains to improve generalization before our models could be used clinically.
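
For readers unfamiliar with the two building blocks named above, the following is a minimal sketch of attention-based multiple instance learning pooling and of the concordance index (C-index), not the implementation used in this work. It assumes tile embeddings have already been extracted (for example by a frozen foundation model); the function names, the toy dimensions, and the random parameters `V` and `w` are hypothetical stand-ins for learned weights, and the survival data are invented purely for illustration.

```python
import numpy as np


def attention_mil_pool(H, V, w):
    """Attention-based MIL pooling over one bag of tile embeddings.

    H: (n_tiles, d) tile embeddings for a single slide or patient.
    V: (d, hidden) and w: (hidden,) are hypothetical learned parameters.
    Returns the bag-level embedding (d,) and attention weights (n_tiles,).
    """
    scores = np.tanh(H @ V) @ w      # one unnormalized score per tile
    scores = scores - scores.max()   # stabilize the softmax numerically
    attn = np.exp(scores)
    attn = attn / attn.sum()         # weights sum to 1 over the bag
    return attn @ H, attn            # attention-weighted mean of the tiles


def concordance_index(time, event, risk):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before time[j]. Ties in risk count as half-concordant.
    Assumes at least one comparable pair exists.
    """
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy bag: 50 tiles with 16-dimensional embeddings (illustrative sizes only).
rng = np.random.default_rng(0)
H = rng.normal(size=(50, 16))
V = rng.normal(size=(16, 8))
w = rng.normal(size=8)
z, attn = attention_mil_pool(H, V, w)

# Invented follow-up data: times, event indicators (1 = progression), risks.
time = np.array([2.0, 5.0, 3.0, 8.0])
event = np.array([1, 0, 1, 1])
risk = np.array([0.9, 0.2, 0.6, 0.1])
c = concordance_index(time, event, risk)
```

In this sketch the attention weights indicate how much each tile contributes to the patient-level representation, which is what allows training from patient-level labels alone; a C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk.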