Abstract
Background/Objectives: Isocitrate dehydrogenase (IDH) mutation status is a key prognostic indicator in diffuse gliomas, but it is currently determined clinically from invasively sampled tissue. Non-invasive preoperative identification of IDH mutation from routine anatomical MRI could support treatment decision-making. This study evaluated deep learning models for IDH mutation detection from routine anatomical MRI (post-contrast T1-weighted (T1c), T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images) and quantified how tumor-focused image preprocessing and the training scheme, centralized learning (CL) versus federated learning (FL) with alternative aggregation strategies, affected model performance. Methods: Anatomical MRI from 501 patients with diffuse glioma in the UCSF Preoperative Diffuse Glioma MRI (UCSF-PDGM) dataset was analyzed using a deep learning classifier built on a 2D U-Net encoder, with age and sex included as covariates. Two tumor-focused image preprocessing methods, Naïve Soft Filtering (NSF) and Gradient-Based Soft Filtering (GBSF), were compared. CL was benchmarked against FL with Federated Averaging (FA) and Federated Trimmed Mean (FTM) aggregation. Model performance was compared in terms of accuracy, precision, recall, F1 score, specificity, and the area under the receiver operating characteristic curve (ROC-AUC). Results: The CL model with NSF achieved the best test performance (accuracy = 0.949, F1 = 0.951, ROC-AUC = 0.971), and NSF consistently outperformed GBSF. FL performance decreased relative to CL, and FA outperformed FTM (accuracy = 0.949 for FA vs. 0.915 for FTM), indicating that the choice of FL aggregation strategy influences model performance. Conclusions: Deep learning applied to routine anatomical MRI could classify IDH mutation status with high accuracy. Context-preserving image preprocessing with NSF substantially improved performance across training schemes. FL provides a privacy-preserving alternative to CL but incurs a measurable performance degradation that is sensitive to the choice of aggregation strategy.
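
The two FL aggregation strategies compared above differ only in how the server combines client parameter updates. The following is a minimal illustrative sketch in Python, not the implementation used in this study: the function names, the toy client values, the sample-size weighting, and the `trim_k` parameter are all assumptions introduced here for illustration.

```python
import numpy as np

def federated_average(updates, num_samples):
    """Sample-size-weighted mean of client parameter vectors (FedAvg-style)."""
    updates = np.asarray(updates, dtype=float)   # shape: (clients, params)
    weights = np.asarray(num_samples, dtype=float)
    return np.average(updates, axis=0, weights=weights)

def federated_trimmed_mean(updates, trim_k):
    """Coordinate-wise mean after dropping the trim_k lowest and highest values."""
    updates = np.asarray(updates, dtype=float)
    n_clients = updates.shape[0]
    if not 0 <= 2 * trim_k < n_clients:
        raise ValueError("trim_k must leave at least one client per coordinate")
    ordered = np.sort(updates, axis=0)           # sort each coordinate across clients
    kept = ordered[trim_k:n_clients - trim_k]    # discard the extremes on both sides
    return kept.mean(axis=0)

# Hypothetical toy example: four clients, two parameters, one outlying client.
client_updates = [[0.10, 1.0], [0.12, 1.1], [0.11, 0.9], [5.00, 1.0]]
client_sizes = [100, 100, 100, 100]

fa = federated_average(client_updates, client_sizes)
ftm = federated_trimmed_mean(client_updates, trim_k=1)
print("FA :", fa)
print("FTM:", ftm)
```

In this toy case the averaging rule is pulled toward the outlying client on the first parameter, whereas the trimmed mean discards it. Trimming also discards information from well-behaved clients, which is one plausible reason a trimmed aggregate can underperform plain averaging when no client is anomalous.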