Abstract
This study aimed to develop a comprehensive deep learning and radiomics-based multi-task pipeline for the detection and classification of key fetal anatomical structures in first-trimester ultrasound images, using a diverse multi-center dataset to ensure high variability, reproducibility, and generalizability. A total of 4,532 fetal ultrasound scans (gestational age 11–14 weeks), retrospectively collected from nine medical centers, were included in this study. Two detection models, You Only Look Once version 11 (YOLOv11) and shifted window transformer (Swin Transformer), were trained to localize nine fetal brain and craniofacial structures. From each detected region, 215 radiomic features and 1,792 deep features were extracted. Feature stability was ensured through intra-class correlation coefficient (ICC) filtering (threshold ≥ 0.75), Pearson correlation analysis, and least absolute shrinkage and selection operator (LASSO) regression. The selected features were then used to train a Transformer-based model for tabular data (TabTransformer) to classify fetal anatomical structures into clinically defined categories based on their sonographic appearance. Model performance was evaluated using accuracy, area under the receiver operating characteristic curve (AUC), and sensitivity across the training, internal validation, and external test datasets. Fusion models integrating radiomic and deep features consistently outperformed single-modality models in both detection and classification. On the external test set, classification accuracy reached 96.1%, with AUCs up to 96.89% and sensitivity exceeding 95% for key anatomical structures. Swin Transformer achieved superior localization performance compared to YOLOv11, with Intersection over Union (IoU) values up to 0.97 and F1-scores ≥ 0.94. Feature reproducibility remained above 75% across centers.
The TabTransformer classifier demonstrated strong generalization and robustness, effectively leveraging the fused feature space for high-precision classification. This study presents a fully integrated, multi-task framework for fetal anatomical structure detection and classification using multi-center ultrasound data. The proposed approach demonstrates high reproducibility and diagnostic performance, offering strong clinical potential for early and objective fetal anomaly screening in the first trimester. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-41635-8.
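The first two feature-stability steps named in the abstract (ICC filtering at a threshold of 0.75, followed by Pearson correlation analysis) can be illustrated with a minimal sketch. Several details here are assumptions, since the abstract does not specify them: the ICC form (ICC(3,1), two-way mixed, consistency, single measurement), the Pearson cut-off of 0.9, the greedy pruning order, and the hypothetical names `icc_3_1`, `pearson`, and `select_stable_features`. The LASSO step is omitted; it would be applied to the features that survive this filter.

```python
from statistics import mean


def icc_3_1(table):
    """ICC(3,1) for an n_subjects x k_repeats table (assumed ICC form)."""
    n, k = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(row[j] for row in table) for j in range(k)]
    # Two-way ANOVA sums of squares: subjects (rows), repeats (columns), total.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err
    return (ms_rows - ms_err) / denom if denom > 0 else 0.0


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_stable_features(repeats, icc_min=0.75, r_max=0.9):
    """Keep features with ICC >= icc_min, then greedily drop redundant ones.

    repeats maps a feature name to its n_subjects x k_repeats table.
    r_max is an assumed cut-off; the abstract gives no Pearson threshold.
    """
    stable = [f for f, t in repeats.items() if icc_3_1(t) >= icc_min]
    per_subject = {f: [mean(row) for row in repeats[f]] for f in stable}
    kept = []
    for f in stable:
        # Drop f if it is strongly correlated with a feature already kept.
        if all(abs(pearson(per_subject[f], per_subject[g])) < r_max for g in kept):
            kept.append(f)
    return kept


# Toy data: four subjects, two repeated measurements per feature.
demo = {
    "a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.1]],  # stable
    "b": [[2.0, 2.2], [4.0, 4.2], [6.0, 5.8], [8.0, 8.2]],  # stable, redundant with "a"
    "c": [[1.0, 4.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]],  # unstable across repeats
    "d": [[2.0, 2.0], [1.0, 1.1], [2.0, 1.9], [1.0, 1.0]],  # stable, not redundant
}
print(select_stable_features(demo))
```

In this toy example, feature `c` fails the ICC threshold and feature `b` is removed as redundant with `a`, leaving `a` and `d` as inputs to a subsequent LASSO step.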