What emotions reveal about patient safety: GPT-4-based sentiment and emotion analysis of 11056 German CIRS medical reports (2005-2024)

Abstract

OBJECTIVES: Critical incident reporting systems (CIRS) collect narrative reports on medical errors, but the emotional signals within these reports, which may indicate perceived risk and systemic weakness, are rarely examined. This cross-sectional study applied large language model-based sentiment analysis to explore how emotional expression in CIRS data may support artificial intelligence-enhanced patient safety monitoring.

METHODS: We analysed 11 056 anonymised German incident reports submitted between 2005 and 2024 using GPT-4 (Generative Pre-trained Transformer 4, gpt-4-turbo-2024-04-09, zero-shot) to assign sentiment labels and quantify five emotions (fear, frustration, anger, sadness, guilt; scale 0-1). Emotional profiles were clustered (k-means) and thematic patterns extracted via latent Dirichlet allocation. Associations were examined using non-parametric tests.

RESULTS: Negative sentiment dominated (95.6%, 95% CI 94.9% to 96.2%). Fear (mean=0.63, SD=0.21) and frustration (mean=0.59, SD=0.19) were most prevalent. Emergency care settings showed higher fear (p<0.05) and guilt (p<0.001). Reports with strong emotional expression, especially fear, guilt and sadness, were less likely to receive formal feedback (43.1% (95% CI 41.7% to 44.5%) vs 48.1% (95% CI 46.5% to 49.7%); absolute difference=5.0 percentage points (95% CI 2.7 to 7.3); p=0.001).

DISCUSSION: Emotion intensity did not consistently correlate with harm severity but was linked to care context and systemic complexity. Emotion clusters reflected distinct clinical and organisational patterns, from acute emergencies to procedural failures.

CONCLUSION: Emotion-based analysis of incident reports provides insight into perceived burden and care context. Sentiment profiling may improve system interpretability and support emotion-sensitive safety culture and feedback. Leveraging large language models can reduce reviewer workload and enable more targeted triage of emotionally complex reports.
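The feedback-rate comparison in the results is an absolute difference between two proportions with a 95% confidence interval. The abstract does not state the group sizes or the interval method the authors used, so the following is only a minimal sketch of one common approach (the Wald normal approximation). The function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a Wald (normal-approximation) confidence interval.

    Assumption: the study may have used a different interval method.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diff_ci(s1, n1, s2, n2, z=1.96):
    """Absolute difference p2 - p1 with a Wald confidence interval."""
    p1, p2 = s1 / n1, s2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p2 - p1
    return d, d - z * se, d + z * se


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation:
    # 431 of 1000 high-emotion reports vs 481 of 1000 other reports
    # receiving formal feedback.
    d, lo, hi = diff_ci(431, 1000, 481, 1000)
    print(f"difference = {100 * d:.1f} pp (95% CI {100 * lo:.1f} to {100 * hi:.1f})")
```

With the real group sizes, the same calculation could be compared against the reported interval. A mismatch would suggest the authors used a different method, such as Wilson or bootstrap intervals.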
