A multimodal fusion model for bone tumor benign and malignant diagnosis: development and validation with clinical text and radiographs


Abstract

BACKGROUND: Bone tumors have diverse clinical and imaging features, rendering preoperative differentiation of benign and intermediate/malignant types challenging. Unimodal methods (medical records or X-rays) are prone to misdiagnosis or missed diagnosis due to incomplete information. While postoperative histopathology is the gold standard, there is an urgent clinical demand for a precise preoperative diagnostic tool. This study aims to develop and validate a multimodal model integrating deep learning with Dempster-Shafer (DS) evidence theory for the differential diagnosis of benign and intermediate/malignant bone tumors. Using postoperative histopathology as the reference standard, the model achieves diagnosis by integrating preoperative clinical text and radiographs. METHODS: This single-center retrospective study included 319 pathologically confirmed bone tumor patients admitted between 2020 and 2025 who met the selection criteria. Using the patients' X-ray images and medical record text, we constructed a fusion model based on deep learning and DS evidence theory to classify tumors into benign and intermediate/malignant categories. Model performance was evaluated using the receiver operating characteristic (ROC) curve along with its 95% confidence interval (CI). RESULTS: The dataset comprised text data and radiographs from 319 patients and was stratified by time into a training set, an internal validation set, and an external validation set. On the internal validation set, the fusion model achieved an area under the curve (AUC) of 0.821 (95% CI: 0.713-0.916), with an accuracy of 81.6%, precision of 81.3%, recall of 76.5%, and an F1 score of 78.8%, outperforming both the unimodal text model (AUC 0.814, accuracy 77.6%) and the image model (AUC 0.782, accuracy 72.4%).
On the external validation set, the fusion model maintained robust performance: the AUC reached 0.808 (95% CI: 0.667-0.928), accuracy 77.3%, and F1 score 70.6%. Most baseline models underperformed the proposed fusion approach across all metrics, with accuracies ranging from 59.1% to 77.3% and F1 scores ranging from 47.1% to 70.6%. Furthermore, the model's diagnostic performance rivaled that of senior radiologists and significantly outperformed junior radiologists: McNemar's test confirmed no significant difference between the model and senior radiologists, while a statistically significant performance gap existed between junior and senior radiologists. CONCLUSIONS: We developed and validated a fusion model that integrates deep learning and DS evidence theory. In distinguishing benign from intermediate/malignant bone tumors, this fusion model demonstrated encouraging performance compared with unimodal models and other baseline fusion models.
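The abstract states that the text and image branches are fused with DS evidence theory but does not give the formulation. As a minimal sketch of the underlying mechanism, Dempster's rule of combination can merge two per-modality belief assignments over {benign, malignant}; the mass values and the use of residual mass on the full frame to represent uncertainty are illustrative assumptions, not the paper's actual design:

```python
from itertools import product

def ds_combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    m1, m2: dicts mapping focal elements (frozensets of class labels)
    to belief mass; masses in each dict should sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:  # compatible evidence reinforces the intersection
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:      # disjoint focal elements contribute to conflict K
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    # normalize by (1 - K)
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical branch outputs: part of each branch's probability is kept
# on the full frame of discernment (theta) to encode model uncertainty.
B, M = frozenset({"benign"}), frozenset({"malignant"})
theta = B | M
text_m = {B: 0.60, M: 0.25, theta: 0.15}    # clinical-text branch (assumed)
image_m = {B: 0.55, M: 0.30, theta: 0.15}   # radiograph branch (assumed)

fused = ds_combine(text_m, image_m)
```

Because both branches lean toward "benign", the combined mass on {benign} exceeds either branch's individual mass, while the mass left on the full frame shrinks, which is the usual behavior of Dempster's rule when two sources agree.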
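The reader comparison relies on McNemar's test on paired diagnoses. The study's contingency counts are not given in the abstract; with hypothetical discordant cell counts, the exact two-sided test can be computed directly from the binomial distribution:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test from the discordant cells of a paired
    2x2 table: b = pairs where only rater 1 was correct, c = pairs where
    only rater 2 was correct. Returns the p-value."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # two-sided exact binomial p-value under H0: P(discordant toward b) = 0.5
    p = 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
    return min(1.0, p)

# Hypothetical counts: nearly balanced discordance (no significant
# difference) vs. strongly one-sided discordance (significant).
p_similar = mcnemar_exact(5, 6)
p_different = mcnemar_exact(15, 2)
```

With roughly balanced discordant counts the p-value is large, matching the abstract's finding of no significant model-versus-senior-radiologist difference; a heavily one-sided split yields a small p-value, as reported for the junior-versus-senior comparison.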
