Abstract
BACKGROUND: The MRI features of pediatric posterior fossa tumors overlap considerably, which makes preoperative diagnosis difficult even though treatment strategies differ significantly between tumor types. Conventional deep learning models are limited in multi-sequence MRI fusion and in clinical interpretability, so new approaches are needed. OBJECTIVES: This study aims to develop a 2.5D multi-sequence MRI deep learning framework (ResSwinT) that integrates a Residual Network with a Swin Transformer to automatically classify the three main pediatric posterior fossa tumors: pilocytic astrocytoma (PA), medulloblastoma (MB), and ependymoma (EP). It also aims to improve model interpretability with the SHAP method, providing a more reliable basis for clinical decision support. METHODS: This study retrospectively included 309 pediatric patients with pathologically confirmed tumors (109 PA, 130 MB, and 70 EP). The MRI data for each patient comprised five sequences: T1WI, T1C, T2WI, FLAIR, and ADC. After preprocessing (N4 bias field correction, resampling, sequence registration, and intensity normalization), samples were built with a 2.5D image construction strategy, and the ResSwinT model was designed. Its performance was compared with that of seven deep learning models, including ResNet-18 and VGG16, and SHAP analysis was used to quantify feature contributions. RESULTS: The proposed ResSwinT model outperformed the commonly used deep learning models in all classification tasks, particularly in area under the curve (AUC) and overall accuracy (ACC). For the PA vs. non-PA task: ACC 89.5%, AUC 0.975; for the MB vs. non-MB task: ACC 93.7%, AUC 0.978; for the EP vs. non-EP task: ACC 87.5%, AUC 0.937.
SHapley Additive exPlanations (SHAP) analysis showed that the model attends strongly to the gross tumor volume and its surrounding structures, and that the basis of its decisions is highly consistent with key imaging biomarkers, supporting the interpretability and clinical relevance of the model. CONCLUSIONS: ResSwinT achieves accurate classification of pediatric posterior fossa tumors through 2.5D multi-sequence fusion and a cross-attention mechanism. SHAP attribution analysis reveals the biological basis of the model's decisions, providing clinicians with an interpretable AI-assisted diagnostic tool that is expected to help optimize individualized treatment strategies.
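The abstract does not spell out how the 2.5D samples are built, so the following is only an illustrative sketch of one common reading of a 2.5D strategy: for a chosen slice, the neighbouring slices from each co-registered sequence are stacked along the channel axis. The function name `build_25d_sample`, the `half_width` of one slice, the channel ordering, and the random stand-in volumes are all assumptions made for illustration, not details taken from the study.

```python
import numpy as np

# Sequence names follow the abstract; their channel order here is an assumption.
SEQUENCES = ["T1WI", "T1C", "T2WI", "FLAIR", "ADC"]


def build_25d_sample(volumes, center, half_width=1, sequence_order=SEQUENCES):
    """Stack neighbouring slices from several co-registered sequences.

    volumes: dict mapping sequence name -> array of shape (depth, height, width)
    center: index of the central slice
    half_width: neighbours taken on each side (hypothetical default of 1)
    Returns an array of shape (n_sequences * (2 * half_width + 1), height, width).
    """
    channels = []
    for name in sequence_order:
        vol = volumes[name]
        depth = vol.shape[0]
        # Clamp indices so that edge slices are repeated at the volume boundary.
        idx = np.clip(
            np.arange(center - half_width, center + half_width + 1), 0, depth - 1
        )
        channels.append(vol[idx])
    return np.concatenate(channels, axis=0)


# Random arrays stand in for preprocessed, registered MRI volumes.
rng = np.random.default_rng(0)
volumes = {name: rng.random((24, 64, 64), dtype=np.float32) for name in SEQUENCES}

sample = build_25d_sample(volumes, center=0)
print(sample.shape)  # (15, 64, 64): 5 sequences x 3 slices each
```

Under these assumptions each sample carries 15 channels, which gives a 2D backbone some through-plane context without the memory cost of a full 3D network.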