Abstract
OBJECTIVES: Renal cell carcinoma (RCC) is a malignant renal tumor that poses a significant threat to patient health. Accurate preoperative pathological grading is crucial for selecting the appropriate treatment. Deep learning has become an important approach to the pathological grading of RCC. However, existing methods rely primarily on single-phase computed tomography (CT) imaging for analysis and prediction, which has limitations such as missed small lesions, one-sided evaluation, and an overly local focus. This study therefore proposes a multi-modal deep learning algorithm that integrates multi-phase enhanced CT images with clinical variables, aiming to provide a basis for predicting the pathological grade of RCC.
METHODS: The algorithm took as inputs four-phase enhanced CT images (plain scan, arterial phase, venous phase, and delayed phase) together with clinical variables. An embedding encoding module was used to extract heterogeneous information from the clinical variables, and a three-dimensional (3D) ResNet50 model was employed to capture spatial information from the multi-phase enhanced CT images. A Fusion module then integrated the clinical-variable features with the CT image features of each phase and applied a cross-self-attention mechanism to fuse features across phases. This approach captures deep semantic information from the patient data and exploits the complementary strengths of multi-modal and multi-phase data. To validate the effectiveness of the proposed method, a total of 1,229 RCC patients were included, with ethics review approval, to train the model.
RESULTS: The proposed method outperformed traditional radiomics and state-of-the-art deep learning methods, achieving an accuracy of 83.87%, a recall of 95.04%, and an F1-score of 82.23%.
CONCLUSIONS: The proposed algorithm exhibits strong stability and sensitivity, significantly enhancing the predictive performance of RCC pathological grading. It offers a novel approach for accurate RCC diagnosis and personalized treatment planning.
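
The fusion pipeline described in METHODS can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that each phase's 3D ResNet50 output and the clinical embedding are already available as small vectors, that the Fusion module combines them by concatenation, and that phase fusion is plain scaled dot-product self-attention with identity projections followed by mean pooling. The function names (`fuse_clinical`, `self_attention`, `fuse_phases`) and the toy numbers are hypothetical.

```python
import math

# The four contrast phases named in the abstract.
PHASES = ("plain", "arterial", "venous", "delayed")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_clinical(phase_feature, clinical_embedding):
    """Assumed fusion step: concatenate one phase's image feature
    with the clinical embedding to form a single token."""
    return list(phase_feature) + list(clinical_embedding)


def self_attention(tokens):
    """Scaled dot-product self-attention over the phase tokens.
    Identity Q/K/V projections are an assumption; a trained model
    would use learned weight matrices here."""
    scale = math.sqrt(len(tokens[0]))
    attended = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        attended.append([
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(len(query))
        ])
    return attended


def fuse_phases(phase_features, clinical_embedding):
    """Build one token per phase, attend across phases, then mean-pool."""
    tokens = [fuse_clinical(phase_features[p], clinical_embedding) for p in PHASES]
    attended = self_attention(tokens)
    dim = len(attended[0])
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(dim)]


# Toy stand-ins for per-phase image features (hypothetical values).
phase_features = {
    "plain": [0.1, 0.0, 0.2],
    "arterial": [0.9, 0.3, 0.1],
    "venous": [0.6, 0.5, 0.4],
    "delayed": [0.2, 0.4, 0.7],
}
# Toy stand-in for the embedded clinical variables (hypothetical values).
clinical_embedding = [0.5, -0.5]

fused = fuse_phases(phase_features, clinical_embedding)
print(fused)
```

In this sketch the fused vector would feed a grading classifier. Because every phase token carries the same clinical embedding, the attention-weighted average leaves those components unchanged, while the image components become a convex combination of the four phases.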