Abstract
BACKGROUND: Innovative, scalable mental health tools are needed to address systemic provider shortages and accessibility barriers. Large language model-based tools can provide real-time, tailored feedback to help users engage in cognitive reappraisal outside traditional therapy sessions. Socrates 2.0 (Rush University Medical Center) is a multiagent artificial intelligence tool that guides users through Socratic dialogue. OBJECTIVE: This study aimed to examine the feasibility, acceptability, and potential for symptom reduction of Socrates 2.0. METHODS: A total of 61 adult participants enrolled in a 4-week mixed methods preclinical feasibility study. Participants used Socrates 2.0 as desired and completed self-report measures of depression, social anxiety, posttraumatic stress, and obsessive-compulsive symptoms at baseline and at 1-month follow-up. Feasibility, acceptability, and appropriateness, along with usability and working alliance, were assessed via validated measures. Semistructured interviews explored user experiences and perceptions. RESULTS: Participants engaged with Socrates 2.0 a mean of 6.70 (SD 4.57) times over 4 weeks. Feasibility (mean 4.26, SD 0.67), acceptability (mean 4.16, SD 0.84), and usability ratings were high. Participants reported small-to-moderate reductions in depression (effect size d=0.30), social anxiety (d=0.25), obsessive-compulsive (d=0.33), and posttraumatic stress (d=0.28) symptoms. Working alliance scores suggested a moderately strong perceived bond with the artificial intelligence tool. Qualitative feedback indicated that the nonjudgmental, on-demand nature of Socrates 2.0 encouraged self-reflection and exploration. Some users criticized the repetitive questioning style and limited conversation depth. CONCLUSIONS: Socrates 2.0 was perceived as feasible, acceptable, and moderately helpful for self-guided cognitive reappraisal, demonstrating potential as an adjunct to traditional therapy.
Further research, including randomized trials, is needed to determine effectiveness across different populations, optimize personalization, and reduce conversational repetitiveness.