Abstract
OBJECTIVE: Gliomas, among the most aggressive brain tumors, are infamous for their low survival rates. Tumor grade and isocitrate dehydrogenase (IDH) mutation status are key prognostic biomarkers for gliomas, but obtaining them typically requires invasive procedures such as biopsy. Multimodal MRI offers an effective, noninvasive alternative that can reveal tumor spatial information and the microenvironment; in particular, low-grade and IDH-mutant gliomas often exhibit the T2-FLAIR mismatch sign. Medical image foundation models can capture complex representations in medical images, and fine-tuning them may further enhance glioma diagnosis.
METHODS: We propose a multi-task network, MTSAM, for simultaneous glioma IDH genotyping and grading. MTSAM first uses dilated convolutions, which approximate large-kernel convolutions, to coarsely survey the T2 and FLAIR images, and then applies standard convolutions for a detailed exploration of both modalities; the weighted T2 and FLAIR features are subtracted to obtain T2-FLAIR mismatch features. These mismatch features are concatenated with the multimodal MRIs and fed into a customized SAM-Med3D, which is fine-tuned by leveraging complementary information across multi-view modalities, including MRIs, handcrafted radiomics (HCR), and clinical features, and then extracts deep features for accurate IDH genotyping and grading.
RESULTS: MTSAM achieves AUCs of 92.38% and 94.31% for glioma IDH genotyping and grading, respectively, on the UCSF-PDGM dataset, and AUCs of 91.56% and 93.37% on the BraTS2020 dataset, outperforming competing methods. In addition, we use Grad-CAM to visualize MTSAM's attention maps, demonstrating its potential for noninvasive glioma diagnosis.
CONCLUSION: The proposed method demonstrates that fusing multi-view, noninvasive information and fully exploiting the knowledge that medical image foundation models have learned from large-scale medical datasets can facilitate glioma diagnosis, thereby advancing glioma research.