Abstract
This study designed and evaluated a graph-based deep-learning architecture for automated multi-label classification of breast cancer from multiparametric magnetic resonance imaging (MRI) data and clinical features. The aim was to improve diagnostic accuracy in three key clinical tasks: biomarker discovery, tumour staging, and histological grading. Three-dimensional volumetric graphs were built from five MRI modalities (Phase 1, Phase 2, Phase 3, T1, and IHSFused), with each voxel and each clinical variable represented as a node. Node attributes comprised radiomic features and patient-specific clinical data, whereas edges encoded relationships based on spatial proximity and intensity. The graph structure was processed by a Graph Isomorphism Network (GIN), which iteratively refined node embeddings through neighbourhood aggregation. The multi-label classification targeted HER2 status, TNM stage (Tumour Size, Node involvement, Metastasis), and the components of histological grade (Tubule formation, Mitotic count, Nuclear pleomorphism). The model was trained with Cross-Entropy loss and the Adam optimiser, and the Synthetic Minority Over-Sampling Technique (SMOTE) was used to balance the class distribution. A five-fold cross-validation protocol was used to assess robustness, and performance was reported in terms of accuracy, precision, recall, F1-score, and confusion matrices. t-distributed Stochastic Neighbor Embedding (t-SNE) visualisations were used to assess feature separability. The model achieved high accuracy for HER2 (up to 91.42%) and Tubule grade (up to 91.83%), moderate accuracy for Tumour Size (70–72%), Nodes (82–84%), and Metastasis (80–83%), and lower accuracy for Nuclear grade (64–68%). Confusion matrices showed minimal misclassification for the high-performing labels and greater variability for the lower-performing ones. t-SNE plots confirmed effective feature separation for HER2 and Tubule, with overlap noted for Nuclear.
These results indicate the potential of volumetric graph-based learning for comprehensive breast cancer classification and its relevance to precision diagnostics supported by artificial intelligence. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12672-025-04359-1.
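As an illustration of the neighbourhood aggregation that a GIN performs, the following is a minimal sketch in plain Python and not the implementation used in this study. It applies the standard GIN update, in which each node's features are scaled by (1 + eps), summed with the features of its neighbours, and passed through a small network. The toy three-node graph, the identity weights, the single-layer stand-in for the MLP, and the names `gin_aggregate`, `mlp`, and `gin_layer` are all assumptions made for the example.

```python
from typing import Dict, List

Vector = List[float]


def gin_aggregate(features: Dict[int, Vector],
                  neighbours: Dict[int, List[int]],
                  eps: float = 0.0) -> Dict[int, Vector]:
    """Compute (1 + eps) * h_v + sum of neighbour features for every node."""
    aggregated = {}
    for v, h_v in features.items():
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbours.get(v, []):
            for i, x in enumerate(features[u]):
                agg[i] += x
        aggregated[v] = agg
    return aggregated


def mlp(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """One linear layer followed by ReLU (a stand-in for the GIN MLP)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def gin_layer(features: Dict[int, Vector],
              neighbours: Dict[int, List[int]],
              weights: List[Vector],
              bias: Vector,
              eps: float = 0.0) -> Dict[int, Vector]:
    """Apply one GIN update: sum aggregation, then the shared network."""
    aggregated = gin_aggregate(features, neighbours, eps)
    return {v: mlp(h, weights, bias) for v, h in aggregated.items()}


# Hypothetical toy graph: three nodes on a path 0 - 1 - 2.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbours = {0: [1], 1: [0, 2], 2: [1]}

# Identity weights and zero bias, chosen only to keep the arithmetic visible.
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = gin_layer(features, neighbours, identity, [0.0, 0.0])
```

Stacking several such layers lets each node embedding absorb information from progressively larger neighbourhoods. In practice the weights would be learned, for example with the Cross-Entropy loss and Adam optimiser mentioned above.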