Abstract
BACKGROUND: Benign breast disease is an important risk factor for breast cancer development. In this study, we analyzed hematoxylin and eosin-stained whole-slide images from diagnostic benign breast disease biopsies using different deep learning approaches to predict which individuals would subsequently develop breast cancer (cases) and which would not (controls).

METHODS: We randomly divided cases and controls from a nested case-control study of 946 women with benign breast disease into training (331 cases, 331 controls) and test (142 cases, 142 controls) groups. We employed customized VGG-16 and AutoML machine learning models for image-only classification using whole-slide images, logistic regression for classification using only clinicopathological characteristics, and a multimodal network combining whole-slide images and clinicopathological characteristics.

RESULTS: Both the image-only models (area under the receiver operating characteristic curve [AUROC] = 0.83 [SE = 0.001] and 0.78 [SE = 0.001] for the customized VGG-16 and AutoML models, respectively) and the multimodal network (AUROC = 0.89 [SE = 0.03]) had high discriminatory accuracy for breast cancer. The clinicopathological-characteristics-only model had the lowest AUROC (0.54 [SE = 0.03]). In addition, compared with the customized VGG-16 model, which outperformed the AutoML model, the multimodal network had improved accuracy (AUROC = 0.89 [SE = 0.03] vs 0.83 [SE = 0.02]), sensitivity (0.93 [SE = 0.04] vs 0.83 [SE = 0.003]), and specificity (0.86 [SE = 0.03] vs 0.84 [SE = 0.003]).

CONCLUSION: This study opens promising avenues for breast cancer risk assessment in women with benign breast disease. Integrating whole-slide images and clinicopathological characteristics through a multimodal approach substantially improved predictive model performance.
Future research will explore deep learning techniques to understand benign breast disease progression to invasive breast cancer.
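The multimodal approach described above can be illustrated as a late-fusion classifier: an image embedding (as a CNN backbone such as VGG-16 would produce) is concatenated with clinicopathological covariates, and a logistic head produces a risk score. The sketch below is a minimal illustration under assumed feature dimensions and random weights; it is not the paper's trained architecture or parameters.

```python
import numpy as np

def sigmoid(z):
    # Logistic function mapping a score to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def multimodal_predict(image_embedding, clinical_features, w_img, w_clin, bias):
    """Late-fusion classifier: concatenate an image embedding with
    clinicopathological features, then apply a logistic head.
    All weights here are illustrative placeholders."""
    fused = np.concatenate([image_embedding, clinical_features])
    weights = np.concatenate([w_img, w_clin])
    return sigmoid(weights @ fused + bias)

# Toy example: a hypothetical 4-dim image embedding and 2 clinical
# covariates (e.g. age, histologic category), with random weights.
rng = np.random.default_rng(0)
img_emb = rng.normal(size=4)
clin = np.array([0.5, 1.0])
p = multimodal_predict(img_emb, clin, rng.normal(size=4), rng.normal(size=2), 0.0)
print(f"predicted case probability: {p:.3f}")
```

In practice the image embedding would come from the penultimate layer of the fine-tuned CNN, and the fusion weights would be learned jointly with, or after, the backbone; concatenation is only one of several possible fusion strategies.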