Abstract
Psychotherapy process measures such as the Experiencing Scale (EXP) offer valuable insight into clinical interactions but are time-intensive to code. Large language models (LLMs) such as ChatGPT could streamline this process, but empirical validation is still nascent. This exploratory study aimed to provide a proof of concept for coding the EXP with ChatGPT, with special attention to ethical considerations, limitations, and future directions. ChatGPT was used to code 79 psychotherapy transcripts drawn from the EXP manual. Multiple ChatGPT models were tested under varied few-shot prompting protocols. Data collection occurred in three phases, during which the models assigned both modal and peak EXP ratings to all transcripts. ChatGPT demonstrated moderate agreement with manual reference ratings. An efficient configuration (o3-mini, 5-shot prompting) yielded moderate reliability for both modal EXP scores (ICC[3,1] = .67, 95% CI [.53, .79]) and peak EXP scores (ICC[3,1] = .71, 95% CI [.58, .81]). LLMs may feasibly augment or replace human EXP coders under certain conditions. However, the evidence is preliminary, and ethical and technical limitations remain. Future research should validate the present methodology using out-of-manual data, assess potential pretraining exposure, and explore locally hosted LLM applications to mitigate privacy concerns.
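For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how ICC[3,1] (two-way mixed effects, consistency, single rater) can be computed from a transcripts-by-raters matrix. The function name `icc_3_1` and the example ratings are illustrative assumptions; they are not the study's data, and the abstract does not state which software the authors used. Confidence intervals are not computed here.

```python
import numpy as np

def icc_3_1(ratings):
    """ICC(3,1): two-way mixed effects, consistency, single rater.

    `ratings` is an (n_targets, k_raters) array, e.g. one row per
    transcript and one column per rater (reference vs. LLM).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Two-way ANOVA sums of squares (no replication).
    ss_total = ((x - grand) ** 2).sum()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between targets
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_err = ss_total - ss_rows - ss_cols                # residual

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical example: five transcripts, column 0 = reference rating,
# column 1 = model rating (values invented for illustration only).
example = [[2, 2], [3, 4], [4, 4], [5, 4], [6, 6]]
print(round(icc_3_1(example), 2))
```

Because ICC[3,1] is a consistency coefficient, a rater that is systematically higher or lower than the reference by a constant amount is not penalized; an absolute-agreement form would be needed to capture such offsets.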