Artificial Intelligence-Based Automated Interpretation of Images of Electrocardiograms: Development and Multinational Validation of ECG-GPT


Abstract

BACKGROUND: Timely and accurate assessment of electrocardiograms (ECGs) is crucial for diagnosing, triaging, and clinically managing patients. Current workflows rely on computerized ECG interpretation tools built into ECG signal acquisition systems, which use rule-based algorithms that are unreliable and frequently unavailable in low-resource settings. We developed and validated a format-independent vision encoder-decoder model, ECG-GPT, that can generate free-text, expert-level interpretations directly from 12-lead ECG images. METHODS: Using 12-lead ECGs and their corresponding diagnosis statements collected at the Yale-New Haven Health System (YNHHS) between 2000 and 2022, we developed a vision-text transformer model to generate interpretation statements from images of ECGs. Using structured clinical assessment, semantic similarity, and conventional natural language generation metrics, we validated ECG-GPT across 7 geographically distinct health settings. These include (1) 3 large and diverse US health systems, (2) consecutive ECGs from a central reading system in Minas Gerais, Brazil, (3) the prospective cohort study UK Biobank, (4) a Germany-based, publicly available repository, PTB-XL, and (5) a community hospital in Missouri. RESULTS: Overall, 2.9 million ECGs were used for model development. The model performed well in clinical assessment across 26 extracted labels: for atrial fibrillation, sinus tachycardia, sinus bradycardia, premature atrial contractions, and premature ventricular contractions, AUROCs and AUPRCs ranged from 0.80 to 0.95 and from 0.50 to 0.86, respectively. For left bundle branch block, right bundle branch block, first-degree atrioventricular block, left anterior fascicular block, and left posterior fascicular block, AUROCs and AUPRCs ranged from 0.88 to 0.96 and from 0.23 to 0.86, respectively. Across all 26 conditions, diagnostic accuracy ranged between 0.93 and 0.99. ECG-GPT identified the full context of the diagnosis statements with allied conditions.
It had a median pairwise cosine similarity of 0.90 (IQR 0.83-0.97), significantly greater than the median baseline similarity of 0.73 (IQR 0.67-0.78, p<0.001). This separation between median pairwise and baseline similarity remained consistent across all 26 condition-specific subsets. The results were comparable across external validation sites. CONCLUSIONS: We developed and extensively validated a vision encoder-decoder model that generates expert-level interpretations from ECG images. This represents a scalable and accessible strategy for automated ECG analysis, especially in low-resource settings.
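The semantic-similarity evaluation described above compares each generated interpretation against its paired clinician-written statement (pairwise similarity) and against unrelated statements (baseline similarity), then contrasts the two medians. A minimal sketch of that comparison is below; note that the abstract does not specify the embedding model, so this toy version uses bag-of-words vectors, and all ECG statements shown are hypothetical examples, not data from the study.

```python
from collections import Counter
from math import sqrt
from statistics import median

def embed(text):
    # Toy bag-of-words embedding; the study presumably uses a learned
    # sentence-embedding model (assumption: the abstract does not name it).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical (generated, reference) interpretation pairs.
pairs = [
    ("sinus tachycardia otherwise normal ecg",
     "sinus tachycardia normal ecg"),
    ("atrial fibrillation with rapid ventricular response",
     "atrial fibrillation rapid ventricular response"),
]

# Pairwise: each generated statement vs. its own reference.
pairwise = [cosine(embed(g), embed(r)) for g, r in pairs]

# Baseline: each generated statement vs. a *mismatched* reference,
# approximating the chance-level similarity the abstract compares against.
baseline = [cosine(embed(pairs[i][0]), embed(pairs[(i + 1) % len(pairs)][1]))
            for i in range(len(pairs))]

print(median(pairwise), median(baseline))
```

With real embeddings and real data, the study reports a median pairwise similarity of 0.90 against a baseline median of 0.73; the sketch only illustrates the structure of that matched-versus-mismatched comparison.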
