Abstract
INTRODUCTION: Accurate prediction of dental implant outcomes is critical for optimizing treatment planning and reducing failure rates. Conventional clinician assessments are often subjective, prompting the exploration of artificial intelligence (AI) to enhance prognostic precision. This study evaluated a deep learning model, a Mask region-based convolutional neural network (Mask R-CNN), for predicting single-unit dental implant success from preoperative cone-beam computed tomography (CBCT) scans and compared its performance with that of expert implantologists.
MATERIALS AND METHODS: A retrospective cohort study was conducted at the Department of Prosthodontics, analyzing 210 single-unit implants from 190 patients (January 2022 to March 2025) with an 18-month follow-up period. CBCT scans were processed using OsiriX (Bernex, Switzerland), ImageJ, and OpenCV for segmentation and standardization, with augmentation via Imgaug to address class imbalance. The Mask R-CNN model, initialized with ImageNet weights, was trained on 168 implants (80%) using five-fold cross-validation, with 42 implants (20%) reserved for testing. The model was implemented in Python (Keras, TensorFlow) and compared with a junior (3 years of experience) and a senior (15 years of experience) implantologist. Metrics included accuracy, area under the curve (AUC), sensitivity, specificity, precision, F1-score, and Cohen's kappa (κ) for interobserver reliability. Statistical analyses were performed using R and SciPy, with significance set at p < 0.05.
RESULTS: On the test set (n = 42; 28 successes and 14 failures), the Mask R-CNN model achieved an accuracy of 0.943, an AUC of 0.943, a sensitivity of 0.943, a specificity of 0.943, a precision of 0.971, and an F1-score of 0.957. Expert 1 (junior) recorded an accuracy of 0.857 and an AUC of 0.850, and Expert 2 (senior) recorded 0.829 and 0.818, respectively. Reliability analysis on a 20-case subset showed κ = 0.87 for the model (95% CI: 0.85-0.89, p < 0.001), surpassing Expert 1 (κ = 0.69) and Expert 2 (κ = 0.62). Significant predictors of failure included higher preoperative bone density (p = 0.006), wider apical mesiodistal space (p = 0.005), shorter implants (p = 0.008), and higher insertion torque (p = 0.006). Smoking status was not significant (p = 0.711).
CONCLUSION: The Mask R-CNN model outperformed expert implantologists in predicting implant outcomes from CBCT-derived features, with high reliability and interpretability. Its integration into clinical workflows could enhance risk stratification, although prospective multicenter validation is needed.
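For readers less familiar with the reported metrics, the following minimal Python sketch shows how accuracy, sensitivity, specificity, precision, F1-score, and Cohen's kappa are conventionally defined for a binary success/failure outcome. It is an illustration only and not the study's analysis code: the names `Confusion`, `binary_metrics`, and `cohens_kappa` are hypothetical, "success" is assumed to be the positive class, and the counts and labels in the usage example are invented for demonstration rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts, with 'success' assumed as the positive class."""
    tp: int  # predicted success, observed success
    fp: int  # predicted success, observed failure
    fn: int  # predicted failure, observed success
    tn: int  # predicted failure, observed failure


def binary_metrics(c: Confusion) -> dict:
    """Standard metric definitions; assumes both classes occur and at least
    one positive prediction exists, so no denominator is zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # recall on observed successes
    specificity = c.tn / (c.tn + c.fp)   # recall on observed failures
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Chance-corrected agreement between two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    expected = sum(
        (labels_a.count(k) / n) * (labels_b.count(k) / n) for k in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts and labels, chosen only to make the arithmetic easy to follow.
    print(binary_metrics(Confusion(tp=9, fp=1, fn=1, tn=9)))
    print(cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```

In practice, established implementations (for example, those in scikit-learn) would normally be preferred; the explicit formulas are given here only to make the definitions concrete.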