Large language models management of complex medication regimens: a case-based evaluation

Abstract

BACKGROUND: Large language models (LLMs) have shown the ability to diagnose complex medical cases, but few studies have evaluated their performance in developing evidence-based treatment plans. The purpose of this evaluation was to test four LLMs on their ability to develop safe and efficacious treatment plans for complex patients managed in the intensive care unit (ICU).

METHODS: Eight high-fidelity patient cases focusing on medication management were developed by critical care clinicians. Each case included the history of present illness, laboratory values, vital signs, home medications, and current medications. Four LLMs [ChatGPT (GPT-3.5), ChatGPT (GPT-4), Claude-2, and Llama-2-70b] were prompted to develop an optimized medication regimen for each case. The LLM-generated medication regimens were then reviewed by a panel of seven critical care clinicians to assess safety and efficacy, defined respectively by the medication errors identified and by the appropriateness of treatment for the clinical conditions. Appropriate treatment was measured as the average rate of clinician agreement to continue each medication in the regimen, and the rates were compared across LLMs using analysis of variance (ANOVA).

RESULTS: Clinicians identified a median of 4.1-6.9 medication errors per recommended regimen, and life-threatening medication recommendations were present in 16.3%-57.1% of the regimens, depending on the LLM. Clinicians continued LLM-recommended medications at a rate of 54.6%-67.3%. GPT-4 had the highest rate of medication continuation among the LLMs tested (p < 0.001) and the lowest rate of life-threatening medication errors (p < 0.001).

CONCLUSION: Caution is warranted when using present LLMs to develop medication regimens, given the number of medication errors identified in this pilot study. However, LLMs did demonstrate potential to serve as clinical decision support for the management of complex medication regimens, although domain-specific prompting and testing are needed.
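The continuation-rate measure and the ANOVA comparison described in METHODS can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names (`continuation_rate`, `one_way_anova_f`), the vote layout, and all numbers below are assumptions made up for illustration. The sketch computes only the F statistic; obtaining a p-value would additionally require the F distribution (for example from a statistics library).

```python
from statistics import mean


def continuation_rate(votes):
    """Average clinician agreement to continue, for one regimen.

    `votes` is a hypothetical layout: one inner list per medication,
    holding one boolean per reviewing clinician (True = continue).
    """
    return mean(sum(v) / len(v) for v in votes)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` holds one list of per-regimen rates for each model.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up example: two medications, each reviewed by four clinicians.
example_votes = [[True, True, False, True], [False, False, True, True]]
rate = continuation_rate(example_votes)

# Made-up per-regimen rates for three hypothetical models.
rates_by_model = [[0.50, 0.55, 0.60], [0.55, 0.60, 0.65], [0.70, 0.75, 0.80]]
f_stat, df_b, df_w = one_way_anova_f(rates_by_model)
print(f"rate={rate:.3f}  F({df_b}, {df_w})={f_stat:.2f}")
```

A large F value indicates that the mean continuation rates differ more between models than the regimen-to-regimen variation within each model would explain.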
