Validation of 13 102 International Classification of Diseases, Tenth Revision, Clinical Modification codes using a large language model-based system

Abstract

OBJECTIVES: To comprehensively evaluate the validity of International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes for both prevalent diagnoses and less common diseases, and to assess the performance of a large language model (LLM)-based system in validating these codes.

MATERIALS AND METHODS: This retrospective study analyzed hospital admissions from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. We developed a validated LLM-based system using GPT-4o, refined through iterative prompt engineering, to assess ICD-10-CM code validity. We measured the positive predictive value (PPV) of ICD-10-CM codes overall, the PPV of principal and secondary diagnoses separately, and the performance of the LLM-based system in code validation.

RESULTS: Among 865 079 assigned codes, the PPV was 84.6% (95% CI, 84.5%-84.6%). Principal diagnoses had a PPV of 93.9% (95% CI, 93.7%-94.1%), while secondary diagnoses had a PPV of 83.8% (95% CI, 83.7%-83.9%). PPV decreased with later diagnosis positions, with the largest decline occurring between principal and secondary diagnoses. Among correctly assigned secondary diagnoses, the majority (67.9%) represented historical or baseline conditions, while 32.1% reflected active conditions that deviated from baseline status; 22.3% of these emerged after hospital admission. The LLM system demonstrated high performance in validating ICD codes, achieving 93.6% accuracy, 95.4% sensitivity, and 85.2% specificity.

DISCUSSION AND CONCLUSION: In this large-scale evaluation, ICD-10-CM codes exhibited generally high accuracy, though accuracy varied by diagnosis position and condition type. A validated LLM system performed comparably to physician review and offers a scalable means to improve coding accuracy. These findings support the potential for integrating LLM-based auditing into routine workflows to strengthen the quality of administrative and research data.
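The metrics reported above (PPV with a 95% CI, accuracy, sensitivity, specificity) can be illustrated with a minimal Python sketch. The abstract does not state which confidence-interval method was used, so the Wilson score interval here is an assumption, and the function names and counts in the usage note are hypothetical illustrations, not data from the study.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, not the paper's stated one)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def validation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Metrics for a validator judged against a reference standard.

    tp: validator and reference both say the code is supported
    fp: validator says supported, reference says unsupported
    tn: validator and reference both say unsupported
    fn: validator says unsupported, reference says supported
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }
```

For example, with hypothetical counts, `validation_metrics(tp=80, fp=10, tn=90, fn=20)` gives an accuracy of 0.85, sensitivity of 0.80, and specificity of 0.90, and `wilson_interval(50, 100)` returns an interval centered on 0.5. The PPV of assigned codes is the same kind of proportion: codes judged correct divided by all assigned codes, with the interval computed from those two counts.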
