Abstract
BACKGROUND: E-medicine use has surged, and health systems are exploring LLMs for message triage. It remains unknown whether patient tone alone alters AI-generated clinical or administrative decisions. METHODS: We created 1,000 clinician-validated primary-care vignettes (500 clinical, 500 sick-leave) and presented each in eight communication styles. Five agentic LLMs generated structured outputs covering triage urgency, sick-leave decisions, and other outcomes. Differences from the neutral control were assessed using chi-square tests (Cramér's V) and t-tests (Cohen's d), with FDR correction. As external validation, 40 real patient e-messages from a large health network were processed using the same pipeline. RESULTS: Across 120,000 agent runs, patient tone produced clear and reproducible shifts. Urgent, threatening, and demanding framings raised same-day or urgent care triage from 14% to 37-63% (V up to 0.69, p < 0.001). Medication advice shifted modestly toward prescription options (Rx from 5% to 7-9%, p < 0.001). Emotional tone increased empathy-based responses from 62% to 70-86% (p < 0.001). In sick-leave tasks, threatening tone reduced approvals (58% to 50%) and granted days (2.60 to 2.36; d = -0.12), while emotional tone slightly increased both. The real-world validation showed the same directional effects, confirming that tone influenced model outputs even in authentic messages. CONCLUSIONS: Agentic LLMs treated patient tone as clinical input, altering triage, follow-up, prescribing, and sick-leave decisions despite identical symptoms. These tone-sensitive shifts may introduce hidden biases, affect resource use, and enable misuse in e-medicine workflows.
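
For readers unfamiliar with the statistics named in the methods, the following is a minimal sketch of how Cramér's V, Cohen's d, and an FDR adjustment can be computed. It is illustrative only and is not the authors' analysis code. The function names are hypothetical, the FDR step assumes the Benjamini-Hochberg procedure (the abstract says only "FDR correction"), Cohen's d assumes a pooled standard deviation with the sign taken as tone condition minus neutral control, and no continuity correction is applied to the chi-square statistic.

```python
import math


def cramers_v(table):
    """Cramér's V for an r x c table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


def cohens_d(tone, neutral):
    """Cohen's d (tone minus neutral) using the pooled standard deviation."""
    n1, n2 = len(tone), len(neutral)
    m1, m2 = sum(tone) / n1, sum(neutral) / n2
    v1 = sum((x - m1) ** 2 for x in tone) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in neutral) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: the source does not name its FDR procedure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each tone condition would be compared with the neutral control separately: a contingency table of tone by triage category goes to `cramers_v`, granted sick-leave days go to `cohens_d`, and the resulting family of p-values goes to `benjamini_hochberg`.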