Abstract
RESEARCH QUESTION: Can artificial intelligence (AI) standardize embryo scoring, and help embryologists to identify embryos with the highest likelihood of pregnancy and live birth?

DESIGN: Multicentre, retrospective, head-to-head analysis across six centres in five countries. An embryo selection algorithm (ESA) and 20 embryologists of varying seniority independently selected the implanting (i.e. 'best') embryo from 1681 pairs (1237 pairs with biochemical pregnancy; 444 pairs with live births), with each pair comprising one embryo with a positive outcome and one embryo with a negative outcome. Accuracy was computed for the ESA and for the embryologists; differences were assessed using McNemar's test.

RESULTS: The accuracy of the ESA was 70.1%. The accuracy of individual embryologists ranged from 64.2% to 68.9% (mean value for embryologists 67.7%), and the accuracy of the expert committee (i.e. majority vote across the 20 embryologists) was 69.5%. McNemar's test indicated a significant advantage for the ESA compared with 14 of 20 embryologists, and the mean value for embryologists (P < 0.05), but no significant difference between the ESA and the remaining six embryologists or the expert committee.

CONCLUSIONS: The ESA achieved higher accuracy than most individual embryologists and the mean value for embryologists, supporting its potential as a standardized adjunct to expert judgement. Confirmation of effectiveness and generalizability requires adequately powered, prospective multicentre trials.
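
The paired comparison described under DESIGN (per-rater accuracy, with differences assessed using McNemar's test) can be illustrated with a minimal sketch. It assumes that each rater's selections are recorded as one boolean per embryo pair (True if the embryo with the positive outcome was chosen). The function names and the toy data are hypothetical and are not taken from the study. The abstract does not state which variant of McNemar's test was used; the exact two-sided binomial form shown here is an assumption.

```python
from math import comb


def accuracy(correct):
    """Fraction of embryo pairs in which the rater chose the positive-outcome embryo."""
    return sum(correct) / len(correct)


def discordant_counts(esa_correct, emb_correct):
    """Count pairs on which exactly one of the two raters was correct.

    b: ESA correct, embryologist wrong
    c: embryologist correct, ESA wrong
    """
    b = sum(1 for e, h in zip(esa_correct, emb_correct) if e and not h)
    c = sum(1 for e, h in zip(esa_correct, emb_correct) if h and not e)
    return b, c


def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value (binomial test on the discordant pairs)."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Lower tail of Binomial(n, 0.5), doubled for a two-sided test and capped at 1
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data for illustration only (hypothetical, NOT the study's results)
esa = [True, True, True, False, True, True, False, True]
emb = [True, False, True, False, False, True, True, False]

b, c = discordant_counts(esa, emb)
print("ESA accuracy:", accuracy(esa))
print("Embryologist accuracy:", accuracy(emb))
print("Discordant pairs (b, c):", (b, c))
print("McNemar exact p-value:", mcnemar_exact(b, c))
```

Only the discordant pairs enter the test: pairs on which both raters were right, or both wrong, carry no information about which rater is more accurate. This is why McNemar's test suits a design in which the ESA and each embryologist score the same embryo pairs.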