Transforming free-text coronary angiography reports into structured, analyzable data using large language models

Abstract

Coronary angiography (CAG) reports contain detailed information about coronary anatomy, lesion characteristics, and interventional procedures, but their free-text format limits their research utility. We therefore sought to develop and validate a framework that uses large language models (LLMs) to convert CAG reports automatically into a standardized structured format. Using 50 CAG reports from a tertiary hospital, we developed a multi-step framework to standardize CAG reports and extract key information from them. First, cardiologists developed a standard annotation schema. Next, an LLM (GPT-4o) converted the free-text CAG reports into this hierarchical annotation schema in a standardized format. Finally, clinically relevant information was extracted from the standardized schema. One hundred CAG reports from each of two hospitals were used for internal and external testing, respectively. The 12 key information points comprised four CAG-related key points (previous stent information, lesion characteristics, and anatomical diagnosis) and eight percutaneous coronary intervention (PCI)-related key points (complex PCI criteria and current stent information). For the internal test, two interventional cardiologists independently extracted the information, with discrepancies resolved through consensus, to form the reference standard. Against this reference standard, the proposed framework demonstrated superior accuracy for CAG-related key points (99.5% vs. 91.8%; p < 0.001) and comparable accuracy for PCI-related key points (98.3% vs. 97.4%; p = 0.512) in the internal test. The external test confirmed high accuracy for both CAG-related (96.2%) and PCI-related key points (99.4%). The framework standardized free-text CAG reports with high accuracy, potentially enabling more efficient use of detailed clinical data for cardiovascular research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-025-32150-3.
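
The last two steps of the framework (filling a hierarchical schema, then extracting key points from it) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not publish the annotation schema, so the `Lesion` and `StructuredReport` classes, their field names, the JSON layout, and the 50% stenosis threshold are all hypothetical. The LLM call is omitted; the sketch assumes the model has already returned JSON matching this illustrative schema.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lesion:
    """One lesion entry. All field names are hypothetical."""
    vessel: str                   # e.g. "LAD", "LCX", "RCA"
    segment: str                  # e.g. "proximal", "mid", "distal"
    stenosis_percent: int         # stenosis as reported in the free text
    previous_stent: bool = False  # stent from an earlier procedure
    new_stent: bool = False       # stent implanted during this procedure


@dataclass
class StructuredReport:
    """Hierarchical container: one report holds many lesions."""
    lesions: List[Lesion] = field(default_factory=list)


def parse_llm_output(raw_json: str) -> StructuredReport:
    """Load the (assumed) JSON emitted by the LLM into the schema."""
    data = json.loads(raw_json)
    return StructuredReport(lesions=[Lesion(**item) for item in data["lesions"]])


def extract_key_points(report: StructuredReport, threshold: int = 50) -> Dict[str, object]:
    """Derive a few illustrative key points with plain rules.

    The threshold is an assumption chosen for illustration only.
    """
    flagged = sorted(
        {les.vessel for les in report.lesions if les.stenosis_percent >= threshold}
    )
    return {
        "vessels_at_or_above_threshold": flagged,
        "any_previous_stent": any(les.previous_stent for les in report.lesions),
        "stents_implanted": sum(les.new_stent for les in report.lesions),
    }


if __name__ == "__main__":
    # Hand-written sample standing in for the LLM's structured output.
    sample = json.dumps({
        "lesions": [
            {"vessel": "LAD", "segment": "proximal",
             "stenosis_percent": 80, "new_stent": True},
            {"vessel": "RCA", "segment": "mid",
             "stenosis_percent": 30, "previous_stent": True},
        ]
    })
    print(extract_key_points(parse_llm_output(sample)))
```

Separating the two steps means the LLM only has to normalize wording into a fixed schema, while the key points are computed by deterministic rules that can be audited and changed without rerunning the model.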
