Abstract
BACKGROUND: Breast cancer remains a major public health issue in Japan, and artificial intelligence (AI)-based computer-aided detection (CAD) systems have the potential to enhance diagnostic performance. We conducted a two-phase evaluation of a commercial AI-CAD system trained on non-Japanese data: an external validation using Japanese mammography images and a reader study assessing its impact on radiologists' diagnostic performance. METHODS: The primary analysis was an external validation of the standalone diagnostic performance of the AI-CAD system, which was developed and trained outside Japan, using full-field digital mammography (FFDM) images from a validation cohort of 338 Japanese patients. To further assess its practical utility, a supplementary multi-reader study with 40 selected cases was conducted to observe the interaction between radiologists and AI output. Diagnostic performance was assessed using sensitivity, specificity, and receiver operating characteristic (ROC) curve analysis. RESULTS: In the validation cohort, the AI-CAD achieved a sensitivity of 79%, a specificity of 89%, and an area under the ROC curve (AUC) of 0.897 (95% CI 0.860-0.934). In the reader study of 40 cases, the radiologists' AUC increased from 0.750 unaided to 0.756 with AI-CAD assistance for Breast Imaging Reporting and Data System (BI-RADS) assessments (p=0.505) and from 0.750 to 0.761 for Likelihood of Malignancy ratings (p=0.110). CONCLUSIONS: AI-aided readings yielded AUCs comparable to unaided readings (overlapping 95% CIs), with no statistically significant difference. These findings suggest the feasibility of applying an AI-CAD trained outside Japan to Japanese cases; larger prospective screening studies are required to establish clinical impact.
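
For readers less familiar with the metrics named above, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from per-case malignancy scores. It is not the analysis code used in this study: the function names, the toy labels and scores, and the 0.5 threshold are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular statistics package.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when a case is called positive if score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a malignant case outscores a benign one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = malignant, 0 = benign; scores are illustrative only.
    toy_labels = [1, 1, 1, 0, 0, 0, 0]
    toy_scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4]
    sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend on the chosen operating threshold, whereas the AUC summarizes performance across all thresholds, which is why the abstract reports both kinds of measure.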