Abstract
OBJECTIVE: This study compares multiple large language models (LLMs), including ChatGPT, DeepSeek, and Llama, in generating meaningful, audience-adapted labels for previously identified latent classes among patients with chronic low back pain (cLBP).

METHODS: Phenotypes were derived from baseline data from two cohorts within the NIH HEAL BACPAC consortium: BACKHOME, a large nationwide e-cohort (train set: N = 3025), and COMEBACK, a deep phenotyping cohort (test set: N = 450). The analysis included pain characteristics, psychosocial factors, lifestyle habits, and social determinants of health. ChatGPT-4o (OpenAI), DeepSeek-R1 (DeepSeek), and Llama 3.3 (Meta) were applied to generate class labels for each combination of audience (clinician, patient, and caregiver), tone (formal, empathetic, and informal), and technicality (high, medium, and low).

RESULTS: A latent class model (LCM) identified four distinct behavioral phenotypes in patients with cLBP: High Distress and Maladaptive Behaviors, Resilient and Adaptive Coping, Intermediate Maladaptive Patterns, and Emotionally Regulated with High Pain Burden. These profiles, previously validated by domain experts, served as the basis for automated labeling with the three LLMs. Each model produced class labels tailored to clinicians, patients, and caregivers across the specified tones and technicality levels. The class names generated by all three LLMs closely matched expert-defined traits such as emotional regulation, resilience, and high distress, indicating strong conceptual alignment and the capacity of LLMs to generate precise, audience-specific labels for complex behavioral and psychological profiles.

CONCLUSIONS: These results highlight the potential of integrating LLM-driven labeling into research and clinical practice, supporting more transparent knowledge translation, improved decision-making, and personalized care.
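The audience × tone × technicality grid described in METHODS can be sketched as a simple prompt enumeration. This is a minimal illustrative sketch, not the study's actual prompts: `build_prompt` and the phenotype description are hypothetical placeholders.

```python
# Hypothetical sketch of the labeling-prompt grid: every combination of
# audience x tone x technicality yields one label-generation prompt per
# latent class. build_prompt and the description are illustrative only.
from itertools import product

AUDIENCES = ["clinician", "patient", "caregiver"]
TONES = ["formal", "empathetic", "informal"]
TECHNICALITY = ["high", "medium", "low"]

def build_prompt(class_description: str, audience: str,
                 tone: str, technicality: str) -> str:
    """Compose one label-generation prompt for a single latent class."""
    return (
        "Generate a concise label for the following cLBP patient phenotype.\n"
        f"Audience: {audience}. Tone: {tone}. Technicality: {technicality}.\n"
        f"Phenotype description: {class_description}"
    )

# Placeholder phenotype summary (paraphrased from one of the four LCM classes).
description = "High psychological distress with maladaptive pain-coping behaviors."

prompts = [
    build_prompt(description, a, t, c)
    for a, t, c in product(AUDIENCES, TONES, TECHNICALITY)
]
print(len(prompts))  # 3 x 3 x 3 = 27 prompts per latent class
```

Each of the 27 prompts would then be sent to each model (ChatGPT-4o, DeepSeek-R1, Llama 3.3) to obtain one label per audience/tone/technicality condition.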