Abstract
INTRODUCTION: Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma (NHL). It is a highly heterogeneous malignancy in which 40% to 50% of patients develop relapsed or refractory (R/R) disease, which carries a poor prognosis. Early prediction of R/R risk is therefore important for adjusting treatment and improving patient outcomes. METHODS: We collected clinical information and hematoxylin and eosin (H&E)-stained images from 227 patients diagnosed with DLBCL at Xuzhou Medical University Affiliated Hospital from 2015 to 2018. Patients were divided into an R/R group and a non-relapsed, non-refractory group based on clinical diagnosis, and the patients in each group were randomly assigned to the training, validation, and test sets in a ratio of 7:1:2. We developed a random forest model to predict R/R risk from clinical features. In addition, we constructed a prediction model based on histopathological images using CLAM (clustering-constrained attention multiple-instance learning), a weakly supervised learning method, after extracting image features with convolutional networks. To improve prediction performance, we further integrated image features and clinical information in a fusion model. RESULTS: The mean area under the ROC curve (AUC) of the fusion model was 0.71±0.07 on the validation set and 0.70±0.04 on the test set. This study proposes a novel method for predicting the R/R risk of DLBCL based on H&E images and clinical features. DISCUSSION: For patients predicted to be at high risk, follow-up monitoring can be intensified and treatment plans can be adjusted promptly.
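The 7:1:2 assignment described in the Methods can be illustrated with a minimal sketch, assuming the split is made separately within each outcome group. The function name `stratified_split`, the random seed, and the group sizes (100 R/R and 127 non-R/R patients) are hypothetical and chosen only so that the total is 227; the abstract does not report group sizes or the authors' actual splitting code.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split sample indices into train/val/test within each label group.

    The test set receives whatever remains after train and val are filled,
    so every sample is assigned exactly once.
    """
    rng = random.Random(seed)

    # Collect the indices of the patients in each outcome group.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0])
        n_val = round(len(indices) * ratios[1])
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]
    return splits


# Hypothetical labels: 1 = R/R, 0 = non-relapsed, non-refractory.
# The 100/127 group sizes are an assumption, not data from the study.
labels = [1] * 100 + [0] * 127
splits = stratified_split(labels)
print({name: len(idx) for name, idx in splits.items()})
```

Splitting within each group keeps the proportion of R/R patients roughly equal across the three sets, which matters when the dataset is small and the reported AUC is averaged over repeated runs.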