Abstract
Garden classification of femoral-neck fractures conventionally relies on radiography or CT, but variations in image quality, indistinct fracture lines, and inter-observer differences often cause misclassification, especially of Garden I/II fractures, and fully automated classification remains unexplored. This retrospective multicenter study (2018-2024) included 10,010 hip images from 806 patients across four Chinese hospitals: 7,818 images (529 patients) were used for model training and internal validation with five-fold cross-validation, and 2,192 images (277 patients) for external robustness testing. Model performance was also compared with that of 12 physicians of varying experience. Performance was assessed by sensitivity, specificity, accuracy, area under the receiver operating characteristic curve (AUC), and other metrics, and interpretability was examined with heat maps. Five-fold cross-validation yielded a mean accuracy of 93.34% and a specificity of 95.29%, and the mean AUC on the independent test set was 95.78%. The model markedly improved resident physicians' diagnostic accuracy, narrowing the gap with senior clinicians. The deep-learning model enables accurate automatic localization and Garden classification of femoral-neck fractures and shows promise for clinical decision support; prospective randomized studies are needed to confirm its clinical utility.
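For readers unfamiliar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity, and accuracy can be computed for a multi-class Garden grading task using a one-vs-rest convention. The function names and the toy labels are hypothetical and purely illustrative; they are not the study's data or code, and the abstract does not state which averaging convention the authors used.

```python
from typing import Dict, List


def one_vs_rest_metrics(y_true: List[str], y_pred: List[str],
                        positive: str) -> Dict[str, float]:
    """Sensitivity and specificity for one class treated as 'positive'."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1          # correctly flagged as the positive class
        elif t != positive and p == positive:
            fp += 1          # another class mistaken for the positive class
        elif t == positive and p != positive:
            fn += 1          # positive class missed
        else:
            tn += 1          # correctly not flagged
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def overall_accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of images whose predicted grade equals the reference grade."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy labels (Garden grades I-IV), for illustration only.
y_true = ["I", "I", "II", "II", "III", "IV", "III", "IV"]
y_pred = ["I", "II", "II", "II", "III", "IV", "IV", "IV"]

metrics_II = one_vs_rest_metrics(y_true, y_pred, positive="II")
accuracy = overall_accuracy(y_true, y_pred)
```

On this toy example, one Garden I image is graded as II, so the specificity for class II falls below 1 while its sensitivity is unaffected. This mirrors the Garden I/II confusion noted above.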