A large language model based pipeline for extracting information from patient complaint and anamnesis in clinical notes for severity assessment

Abstract

Identifying critically ill patients in emergency departments (EDs) remains a challenge, partly because little information is available at the time of admission. Clinical notes in patient records have already drawn attention for their value in improving prediction, and recent large language models (LLMs) have shown promising performance. However, the use of LLMs for analyzing clinical notes has not been extensively investigated. To improve the assessment of illness severity and the prediction of triage level, we developed a pipeline that uses LLMs (e.g., ChatGLM-2, GLM-4, and Alpaca-2) to extract information from the patient complaint and anamnesis in clinical notes. In this pipeline, an LLM receives a text input containing a patient's complaint and anamnesis, constructed using a prompt template, in-context learning (ICL), and retrieval-augmented generation (RAG). A severity score is then extracted from the LLM output and integrated into a predictive model to improve its performance. We demonstrated the effectiveness of the pipeline on patient records from the Chinese Emergency Triage, Assessment, and Treatment (CETAT) database, incorporating the extracted score into logistic regression as a predictor. At the early stage, when vital signs had typically not yet been measured, the score demonstrated the predictive value of the patient complaint and anamnesis (AUC-ROC improved from 0.746 to 0.802). At the later stage, once vital signs were available, the improvement in prediction attributable to the score was weaker but still statistically significant in most cases. Recent LLMs are therefore capable of extracting valuable information from clinical notes for identifying critical illness, as illustrated in our study. More efficient LLM-based methods are still needed to achieve better performance.
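The prompt-construction and score-extraction steps described in the abstract might look roughly like the following Python sketch. It is not the authors' implementation. The prompt wording, the 0 to 10 score range, the token-overlap retrieval, and every function name (`retrieve_similar`, `build_prompt`, `parse_score`, `severity_score`) are assumptions made for illustration, and the LLM is passed in as a plain callable, so no particular model API is implied.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical prompt template; the abstract does not give the actual wording
# or the score scale, so the 0-10 range here is an assumption.
PROMPT_TEMPLATE = (
    "You are an emergency triage assistant.\n"
    "Rate the severity of illness from 0 (mild) to 10 (critical).\n\n"
    "{examples}"
    "Complaint: {complaint}\n"
    "Anamnesis: {anamnesis}\n"
    "Severity score:"
)

# A labelled case: (complaint, anamnesis, severity score).
Case = Tuple[str, str, int]


def _tokens(text: str) -> set:
    """Lowercased word tokens, used for the toy similarity measure."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve_similar(query: str, cases: List[Case], k: int = 2) -> List[Case]:
    """Toy retrieval step: rank labelled cases by Jaccard token overlap.

    A real RAG setup would more likely use embedding search; this stand-in
    only keeps the sketch dependency-free.
    """
    q = _tokens(query)

    def overlap(case: Case) -> float:
        c = _tokens(case[0] + " " + case[1])
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    return sorted(cases, key=overlap, reverse=True)[:k]


def build_prompt(complaint: str, anamnesis: str,
                 cases: List[Case], k: int = 2) -> str:
    """Fill the template, using retrieved cases as in-context examples."""
    shots = retrieve_similar(complaint + " " + anamnesis, cases, k)
    examples = "".join(
        f"Complaint: {c}\nAnamnesis: {a}\nSeverity score: {s}\n\n"
        for c, a, s in shots
    )
    return PROMPT_TEMPLATE.format(
        examples=examples, complaint=complaint, anamnesis=anamnesis
    )


def parse_score(reply: str, lo: float = 0, hi: float = 10) -> Optional[float]:
    """Take the first number in the reply; None if absent or out of range."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    value = float(match.group())
    return value if lo <= value <= hi else None


def severity_score(complaint: str, anamnesis: str, cases: List[Case],
                   llm: Callable[[str], str]) -> Optional[float]:
    """Build the prompt, query the supplied LLM callable, parse the score."""
    return parse_score(llm(build_prompt(complaint, anamnesis, cases)))
```

In a setup like the one the abstract describes, the value returned by `severity_score` would be added as one more predictor column in the logistic regression, alongside whatever structured variables are available at that stage. Returning `None` for unparseable replies leaves the handling of missing scores to the downstream model rather than silently substituting a default.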
