Abstract
OBJECTIVE: This study aimed to evaluate the coherence between data heterogeneity and model complexity by comparing seven convolutional neural network (CNN) architectures, each trained with and without ImageNet pretraining, in a multiclass framework for the histopathological classification of three odontogenic tumors: adenomatoid odontogenic tumor, ameloblastoma, and ameloblastic carcinoma. The goal was to investigate how transfer learning influences performance and diagnostic reliability in a clinically relevant context characterized by overlapping histological patterns.
METHODS: An international, multicenter cross-sectional dataset of 64 hematoxylin- and eosin-stained whole slide images was analyzed, including adenomatoid odontogenic tumor (n = 16), ameloblastoma (n = 27), and ameloblastic carcinoma (n = 21). Seven CNN models (DenseNet121, EfficientNetV2B0, InceptionV3, MobileNet, ResNet50, VGG16, and Xception) were trained and tested on 455,107 patches (224 × 224 pixels). Performance was assessed using accuracy, balanced accuracy, sensitivity, specificity, F1-score, and area under the curve (AUC).
RESULTS: Without ImageNet pretraining, DenseNet121 achieved the highest performance (accuracy = 0.73, balanced accuracy = 0.74, AUC = 0.78, specificity = 0.84, sensitivity = 0.65), followed by EfficientNetV2B0 (accuracy = 0.67, balanced accuracy = 0.68, sensitivity = 0.54). With ImageNet pretraining, performance improved for most architectures. EfficientNetV2B0 reached the best overall results (accuracy = 0.79, balanced accuracy = 0.81, AUC = 0.91, specificity = 0.88, sensitivity = 0.74), while DenseNet121 remained largely unchanged apart from a higher AUC (accuracy = 0.72, balanced accuracy = 0.74, AUC = 0.85, specificity = 0.84, sensitivity = 0.64).
CONCLUSION: Transfer learning with ImageNet weights enhanced the performance of most CNNs, with EfficientNetV2B0 showing the greatest responsiveness to pretraining and DenseNet121 demonstrating intrinsic robustness to initialization. These results highlight the potential of CNN-based frameworks to support the differential diagnosis of odontogenic tumors, an inherently challenging task due to morphological overlap, and they establish reproducible methodological baselines that contribute to the global development of explainable, ensemble-based, and clinically reliable AI systems in oral pathology.
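The evaluation metrics named in the Methods can be illustrated with a minimal Python sketch for a three-class problem. It rests on assumptions, because the abstract does not state the averaging scheme: sensitivity and specificity are computed one-vs-rest per class and macro-averaged, and balanced accuracy is taken as the mean of the two. The reported figures are consistent with that reading (for example, (0.74 + 0.88) / 2 = 0.81 for pretrained EfficientNetV2B0), but this is not confirmed as the authors' procedure. The function name `one_vs_rest_metrics` and the confusion matrix `toy_cm` are hypothetical; the counts are not study data.

```python
from typing import Dict, List


def one_vs_rest_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Summarize a square confusion matrix.

    cm[i][j] is the number of patches whose true class is i and whose
    predicted class is j. Sensitivity and specificity are computed
    one-vs-rest for each class and then macro-averaged (an assumption).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sens, spec = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # true class k, predicted other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other class, predicted k
        tn = total - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
    macro_sens = sum(sens) / n
    macro_spec = sum(spec) / n
    return {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "sensitivity": macro_sens,
        "specificity": macro_spec,
        # Assumed definition: mean of macro sensitivity and macro specificity.
        "balanced_accuracy": (macro_sens + macro_spec) / 2,
    }


# Hypothetical patch counts (rows: true class, columns: predicted class).
# Class order is illustrative: adenomatoid odontogenic tumor,
# ameloblastoma, ameloblastic carcinoma.
toy_cm = [
    [80, 15, 5],
    [10, 70, 20],
    [5, 25, 70],
]

metrics = one_vs_rest_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under this definition, specificity is usually higher than sensitivity in a three-class setting, because each one-vs-rest negative set pools two classes; that matches the pattern in the reported results.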