Using Artificial Intelligence With Natural Language Processing to Combine Electronic Health Record's Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study



Abstract

BACKGROUND: Nonvalvular atrial fibrillation (NVAF) affects almost 6 million Americans and is a major contributor to stroke, yet it remains substantially underdiagnosed and undertreated despite explicit guidelines for oral anticoagulation.
OBJECTIVE: This study investigates whether semisupervised natural language processing (NLP) of free-text information in the electronic health record (EHR), combined with structured EHR data, improves NVAF discovery and treatment, and whether it may offer a method to prevent thousands of deaths and save billions of dollars.
METHODS: We abstracted 96,681 participants from the University of Buffalo faculty practice's EHR. NLP was used to index the notes. We then compared how well structured data alone (International Classification of Diseases codes) versus structured data plus unstructured data from clinical notes identified NVAF and the components of two scores: CHA2DS2-VASc (congestive heart failure, hypertension, age ≥75 years, diabetes mellitus, stroke or transient ischemic attack, vascular disease, age 65 to 74 years, sex category) and HAS-BLED (Hypertension, Abnormal liver/renal function, Stroke history, Bleeding history or predisposition, Labile INR, Elderly, Drug/alcohol usage). In addition, we analyzed data from 63,296,120 participants in the Optum and Truven databases to determine the NVAF frequency, the rates of CHA2DS2-VASc ≥2 with no contraindications to oral anticoagulants, the rates of stroke and death in the untreated population, and first-year costs after stroke.
RESULTS: The structured-plus-unstructured method would have identified 3,976,056 additional true NVAF cases (P<.001) and improved sensitivity for CHA2DS2-VASc and HAS-BLED scores compared with structured data alone (P=.002 and P<.001, respectively), a 32.1% improvement. For the United States, this method would prevent an estimated 176,537 strokes, save 10,575 lives, and save >US $13.5 billion.
CONCLUSIONS: Artificial intelligence-informed bio-surveillance combining NLP of free-text information with structured EHR data improves data completeness, prevents thousands of strokes, and saves lives and funds. This method is applicable to many disorders with profound public health consequences.
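
As an illustration of why adding note-derived findings can raise a risk score, the sketch below computes CHA2DS2-VASc using the standard published point weights for the components listed in METHODS. The `RiskFactors` container and the `merge_sources` rule (a factor counts as present if either source reports it) are assumptions made for this example; the abstract does not describe how the study combined its sources.

```python
from dataclasses import dataclass, fields


@dataclass
class RiskFactors:
    """Presence flags for the non-demographic CHA2DS2-VASc components.

    Hypothetical container for this sketch, not a structure from the study.
    """
    congestive_heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    stroke_or_tia: bool = False
    vascular_disease: bool = False


def merge_sources(structured: RiskFactors, nlp: RiskFactors) -> RiskFactors:
    """Assumed merge rule: a factor is present if either source reports it."""
    return RiskFactors(**{
        f.name: getattr(structured, f.name) or getattr(nlp, f.name)
        for f in fields(RiskFactors)
    })


def cha2ds2_vasc(factors: RiskFactors, age: int, female: bool) -> int:
    """Sum the standard CHA2DS2-VASc points (range 0 to 9)."""
    score = 0
    score += 1 if factors.congestive_heart_failure else 0
    score += 1 if factors.hypertension else 0
    score += 1 if factors.diabetes else 0
    score += 2 if factors.stroke_or_tia else 0      # stroke/TIA counts double
    score += 1 if factors.vascular_disease else 0
    if age >= 75:
        score += 2                                  # age >= 75 counts double
    elif age >= 65:
        score += 1                                  # age 65 to 74
    score += 1 if female else 0                     # sex category
    return score


# Illustrative patient (invented values): codes record only hypertension,
# while the notes also mention diabetes and a prior TIA.
from_codes = RiskFactors(hypertension=True)
from_notes = RiskFactors(diabetes=True, stroke_or_tia=True)
combined = merge_sources(from_codes, from_notes)

print(cha2ds2_vasc(from_codes, age=78, female=True))
print(cha2ds2_vasc(combined, age=78, female=True))
```

For this invented patient the score from codes alone is 4 and the combined score is 7. Both exceed the ≥2 threshold used in the abstract, but a patient with fewer coded factors could cross that threshold only once note-derived findings are included.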
