Abstract
Background/Objectives: Odontogenic cysts and ameloblastomas (AB) are mostly asymptomatic and are often discovered late, once severe symptoms have developed; only histopathological examination provides a definitive diagnosis. Artificial intelligence (AI)-assisted diagnostics offer a fast, noninvasive, painless diagnostic tool. To our knowledge, this is the first meta-analysis to evaluate the classification, detection, and segmentation performance of AI for odontogenic cysts and ABs as distinct entities and to determine whether it can achieve clinically acceptable accuracy. Methods: Our systematic search was conducted on 11 January 2026 in Medline, EMBASE, and the Cochrane Central Register of Controlled Trials without restrictions or filters. Studies comparing AI diagnostics with histopathological diagnostics for odontogenic cysts and ABs were included. Diagnostic parameters, including sensitivity, specificity, and accuracy, were extracted and analyzed; additionally, diagnostic odds ratios were calculated. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Recommendations of the GRADE workgroup were followed to determine the certainty of evidence. Results: Thirteen articles were eligible, of which seven were included in our meta-analysis. The group with the highest sensitivity (Se) was the "no lesion" (N) group (0.9726, 95% CI 0.9284-1; I2 = 46%), followed by the radicular cyst (RC) (mean 0.9054, 95% CI 0.8051-1; I2 = 89%), dentigerous cyst (DC) (mean 0.8788, 95% CI 0.7828-0.9749; I2 = 93%), odontogenic keratocyst (OKC) (0.763, 95% CI 0.6999-0.8262; I2 = 14%) and AB (mean 0.4369, 95% CI 0.231-0.6429; I2 = 79%) groups. Results for AB, RC, and DC were statistically significant.
AB achieved the highest specificity (Sp) (mean 0.9889, 95% CI 0.9736-1; I2 = 0%), followed by RC (mean 0.9724, 95% CI 0.9431-1; I2 = 79%), DC (mean 0.9516, 95% CI 0.9116-0.9917; I2 = 90%), N (mean 0.9226, 95% CI 0.8385-1; I2 = 95%) and OKC (mean 0.8991, 95% CI 0.8683-0.9298; I2 = 8%) groups. DC, N, and RC had statistically significant results. Diagnostic odds ratios (DOR) showed that classification was better than chance for all lesion types. Conclusions: AI demonstrated high specificity and is therefore effective in identifying healthy individuals. However, its sensitivity in detecting diseased patients remains suboptimal and requires further improvement.
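For readers less familiar with the measures reported above, the following is a minimal sketch of how sensitivity, specificity, accuracy, and the diagnostic odds ratio follow from a single 2x2 table of AI output against the histopathological reference. The function name `diagnostic_metrics` and the counts are hypothetical and are not taken from any included study. The sketch covers one table only and does not reproduce the pooling used in the meta-analysis.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute per-class diagnostic measures from one 2x2 table.

    tp, fp, fn, tn: counts of true positives, false positives,
    false negatives, and true negatives against the reference standard.
    """
    sensitivity = tp / (tp + fn)          # share of diseased cases detected
    specificity = tn / (tn + fp)          # share of non-diseased cases cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # DOR = (TP * TN) / (FP * FN); undefined when FP or FN is zero.
    # A continuity correction is commonly applied in that case (not shown).
    dor = (tp * tn) / (fp * fn) if fp > 0 and fn > 0 else None
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "dor": dor,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=45, fp=5, fn=55, tn=395)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A DOR above 1 indicates discrimination better than chance. The hypothetical table also shows that a classifier can have a large DOR while its sensitivity stays low, because high specificity dominates the ratio.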