DeepSeek-assisted LI-RADS classification: AI-driven precision in hepatocellular carcinoma diagnosis

Authors: Zhang, Jun; Liu, Jinpeng; Guo, Mingyang; Zhang, Xin; Xiao, Wenbo; Chen, Feng

Journal:	International Journal of Surgery	Impact factor:	10.100
Year:	2025	Citation:	2025 Sep 1;111(9):5970-5979
DOI:	10.1097/JS9.0000000000002763	Research areas:	Cell biology, oncology

Abstract

BACKGROUND: The clinical utility of the DeepSeek-V3 (DSV3) model in enhancing the accuracy of Liver Imaging Reporting and Data System (LI-RADS, LR) classification remains underexplored. This study aimed to evaluate the diagnostic performance of DSV3 in LR classifications compared to radiologists with varying levels of experience and to assess its potential as a decision-support tool in clinical practice.

MATERIALS AND METHODS: A dual-phase retrospective-prospective study analyzed 426 liver lesions (300 retrospective, 126 prospective) in high-risk hepatocellular carcinoma (HCC) patients who underwent magnetic resonance imaging or computed tomography. Three radiologists (one junior, two seniors) independently classified lesions using LR v2018 criteria, while DSV3 analyzed unstructured radiology reports to generate corresponding classifications. In the prospective cohort, DSV3 processed inputs in both Chinese and English to evaluate language impact. Performance was compared using chi-square test or Fisher's exact test, with pathology as the gold standard.

RESULTS: In the retrospective cohort, DSV3 significantly outperformed junior radiologists in diagnostically challenging categories: LR-3 (17.8% vs. 39.7%, P < 0.05), LR-4 (80.4% vs. 46.2%, P < 0.05), and LR-5 (86.2% vs. 66.7%, P < 0.05), while showing comparable accuracy in LR-1 (90.8% vs. 88.7%), LR-2 (11.9% vs. 25.6%), and LR-M (79.5% vs. 62.1%) classifications (all P > 0.05). Prospective validation confirmed these findings, with DSV3 demonstrating superior performance for LR-3 (13.3% vs. 60.0%), LR-4 (93.3% vs. 66.7%), and LR-5 (93.5% vs. 67.7%) compared to junior radiologists (all P < 0.05). Notably, DSV3 achieved diagnostic parity with senior radiologists across all categories (P > 0.05) and maintained consistent performance between Chinese and English inputs.

CONCLUSION: The DSV3 model effectively improves diagnostic accuracy of LR-3 to LR-5 classifications among junior radiologists. Its language-independent performance and ability to match senior-level expertise suggest strong potential for clinical implementation to standardize HCC diagnosis and optimize treatment decisions.
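
The abstract states that performance was compared using the chi-square test or Fisher's exact test but gives no per-category counts. As an illustration only, the sketch below shows how a two-sided Fisher's exact test on a 2x2 table (reader by correct/incorrect) could be computed in plain Python. The function name `fisher_exact_two_sided` and the counts in the usage example are hypothetical and are not taken from the study; the authors' actual software and table construction are not described in the abstract.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be two readers (e.g. model vs. radiologist) and columns
    correct vs. incorrect classifications. Illustrative helper, not the
    study's own code.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against float rounding).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts, NOT from the paper:
    # reader A: 18 correct, 2 incorrect; reader B: 11 correct, 9 incorrect.
    p = fisher_exact_two_sided(18, 2, 11, 9)
    print(f"two-sided p-value: {p:.4f}")
```

Fisher's exact test is usually preferred over the chi-square test when expected cell counts are small, which may be why the abstract names both.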
