Explainable AI-Driven Analysis of Radiology Reports Using Text and Image Data: Experimental Study


Abstract

BACKGROUND: Artificial intelligence (AI) is increasingly being integrated into clinical diagnostics; yet, its lack of transparency hinders trust and adoption among health care professionals. Explainable artificial intelligence (XAI) has the potential to improve the interpretability and reliability of AI-based decisions in clinical practice.

OBJECTIVE: This study evaluated the use of XAI for interpreting radiology reports to improve health care practitioners' confidence in, and comprehension of, AI-assisted diagnostics.

METHODS: The study used the Indiana University chest x-ray dataset, which contains 3169 textual reports and 6471 images. Textual reports were classified as either normal or abnormal using a range of machine learning approaches: traditional machine learning and ensemble methods, a deep learning model (a long short-term memory network), and transformer-based language models (GPT-2, T5, LLaMA-2, and LLaMA-3.1). For image-based classification, convolutional neural networks (DenseNet121 and DenseNet169) were used. The top-performing models were interpreted with the XAI methods SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations) to support clinical decision making by enhancing transparency and trust in model predictions.

RESULTS: The LLaMA-3.1 model achieved the highest accuracy, 98%, in classifying the textual radiology reports. Statistical analysis confirmed the model's robustness: Cohen κ (κ=0.981) indicated near-perfect agreement beyond chance, and both the chi-square and Fisher exact tests revealed a highly significant association between the actual and predicted labels (P<.001). The McNemar test yielded a nonsignificant result (P=.25), suggesting balanced performance across classes. For the imaging data, the DenseNet169 and DenseNet121 models achieved the highest accuracy of 84%. To assess explainability, LIME and SHAP were applied to the best-performing models; both consistently highlighted medical terms such as "opacity," "consolidation," and "pleural" as clear indicators of abnormal findings in textual reports.

CONCLUSIONS: The research underscores that explainability is an essential component of any AI system used in diagnostics and informs the design and implementation of AI in the health care sector. Such an approach improves diagnostic accuracy and builds confidence among the health care workers who will apply XAI in clinical settings, particularly as AI explainability is adopted for medical purposes.
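
To make the reported evaluation concrete, the following minimal Python sketch (not the authors' code) shows how the agreement and association statistics named in the results, Cohen κ, chi-square, Fisher exact, and McNemar tests, can be computed from actual versus predicted labels using scikit-learn, SciPy, and statsmodels; the label arrays below are illustrative placeholders, not the study's data.

```python
# Minimal sketch (assumed, not the authors' code) of the agreement and
# association statistics reported for the text classifier.
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix
from scipy.stats import chi2_contingency, fisher_exact
from statsmodels.stats.contingency_tables import mcnemar

# Placeholder labels: 1 = abnormal, 0 = normal.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])

# Agreement beyond chance between actual labels and model output.
kappa = cohen_kappa_score(y_true, y_pred)

# 2x2 contingency table of actual vs. predicted labels.
table = confusion_matrix(y_true, y_pred)

# Association between actual and predicted labels.
chi2_stat, chi2_p, _, _ = chi2_contingency(table)
_, fisher_p = fisher_exact(table)

# McNemar's test compares the two off-diagonal (disagreement) cells;
# a nonsignificant result suggests errors are balanced across classes.
mcnemar_p = mcnemar(table, exact=True).pvalue

print(f"kappa={kappa:.3f}, chi2 P={chi2_p:.3g}, "
      f"Fisher P={fisher_p:.3g}, McNemar P={mcnemar_p:.3g}")
```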
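
Likewise, the sketch below illustrates, under assumed models and data, how LIME can surface the report terms that push a text classifier toward an "abnormal" prediction. The TF-IDF plus logistic regression pipeline and the toy reports are hypothetical stand-ins for the models and the Indiana University reports described above.

```python
# Minimal LIME sketch (illustrative only) for explaining a text classifier's
# normal/abnormal prediction on a radiology report.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from lime.lime_text import LimeTextExplainer

# Tiny placeholder corpus standing in for the labeled chest x-ray reports.
reports = [
    "heart size normal lungs clear no acute disease",
    "no focal consolidation no pleural effusion",
    "patchy opacity right lower lobe concerning for pneumonia",
    "pleural effusion with adjacent consolidation",
]
labels = [0, 0, 1, 1]  # 0 = normal, 1 = abnormal

# Train a simple classifier whose predict_proba LIME can query.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(reports, labels)

# LIME perturbs the input text, fits a local surrogate model, and returns
# per-word weights toward each class for this single prediction.
explainer = LimeTextExplainer(class_names=["normal", "abnormal"])
explanation = explainer.explain_instance(
    "patchy opacity and pleural effusion",
    pipeline.predict_proba,
    num_features=5,
)
print(explanation.as_list())  # word/weight pairs, e.g. ("opacity", +0.21)
```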
