Decomposing Persona Prompts for Simulated Clinical Reasoning: A Two-by-Two Factorial In Silico Experiment of Time Pressure and Safety Prioritization

Abstract

BACKGROUND: Persona prompting is widely used to steer large language models (LLMs), but its effects on safety-critical clinical reasoning are not well characterized.

METHODS: We performed a two-by-two factorial in silico experiment crossing time-pressure framing (high versus low) with optimization target (safety-first versus lean-efficiency). We used 28 Japanese-language synthetic emergency department vignettes covering chest pain, abdominal pain, headache, and dyspnea. Four trap cases contained prespecified contraindication or sequencing rules. Each persona evaluated each vignette twice, yielding 224 independent runs. Outputs followed a fixed JavaScript Object Notation (JSON) schema and were scored for the number of proposed tests, entropy of the probability distribution across the top five differential diagnoses, discharge decisions, safety-net specificity, and contraindication or sequencing violations, with severity grading.

RESULTS: High time-pressure framing reduced the number of proposed tests (beta = -1.05, p < 0.001) and diagnostic breadth (beta = -0.246, p < 0.001). Safety-first prompting increased proposed testing (beta = 1.32, p < 0.001) and diagnostic breadth (beta = 0.247, p < 0.001), with no significant interaction. Among discharge plans (36 of 224 runs), safety-first prompting improved safety-net specificity (mean 4.5 versus 2.6 on a five-point scale). Contraindication or sequencing violations occurred only in the high/lean condition (eight of 56 runs, 14.3%); in trap cases, violations were eight of eight under high/lean and zero of 24 in the other three conditions.

CONCLUSIONS: Persona components predictably shifted simulated clinical reasoning. Time-pressure framing narrowed diagnostic search and reduced proposed testing, whereas safety-first prompting improved safety-netting and prevented severe trap-case violations outside the high/lean condition. Prompt-aware stress testing may help identify unsafe prompt configurations before clinical deployment.
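The run counts and the entropy-based breadth measure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the logarithm base, whether probabilities were renormalized, or any field names from the JSON schema, so the natural logarithm, the renormalization step, and every identifier below (`design_counts`, `diagnosis_entropy`, the factor labels) are assumptions made for illustration, not the authors' implementation.

```python
import math
from itertools import product

# Factor levels as named in the abstract; the string labels are illustrative.
TIME_PRESSURE = ("high", "low")
OPTIMIZATION_TARGET = ("safety-first", "lean-efficiency")

N_VIGNETTES = 28    # synthetic emergency department vignettes
N_TRAP_CASES = 4    # vignettes with prespecified contraindication/sequencing rules
N_REPLICATES = 2    # each persona evaluated each vignette twice


def design_counts():
    """Reproduce the run counts implied by the two-by-two factorial design."""
    personas = list(product(TIME_PRESSURE, OPTIMIZATION_TARGET))
    runs_per_persona = N_VIGNETTES * N_REPLICATES
    return {
        "personas": len(personas),
        "runs_per_persona": runs_per_persona,
        "total_runs": len(personas) * runs_per_persona,
        "trap_runs_per_persona": N_TRAP_CASES * N_REPLICATES,
    }


def diagnosis_entropy(probabilities):
    """Shannon entropy in nats of a differential-diagnosis distribution.

    Assumption: the listed probabilities are renormalized to sum to 1
    before the entropy is computed. Zero-probability entries contribute 0.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    normalized = [p / total for p in probabilities]
    return -sum(p * math.log(p) for p in normalized if p > 0)


if __name__ == "__main__":
    print(design_counts())
    # A broad differential (uniform over five diagnoses) versus a narrow one.
    print(round(diagnosis_entropy([0.2] * 5), 3))
    print(round(diagnosis_entropy([0.9, 0.04, 0.03, 0.02, 0.01]), 3))
```

Under these assumptions the design yields 56 runs per persona, 224 runs in total, and eight trap-case runs per persona, matching the denominators reported in the results. A uniform distribution over five diagnoses gives the maximum entropy of ln 5 (about 1.61 nats), while a differential dominated by a single diagnosis scores close to zero, which is the sense in which lower entropy reflects a narrower diagnostic search.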
