Abstract
Contemporary personality assessment relies heavily on psychometric scales, which offer efficiency but risk oversimplifying the rich and contextual nature of personality. Recognizing these limitations, this study explores the use of commercially available generative large language models (LLMs), such as ChatGPT and Claude, to assess personality traits from open-ended qualitative narratives. Across two distinct samples and methodologies (spontaneous streams of thought and daily video diaries), we used seven commercial, generative LLMs to score Big-Five personality traits, achieving convergence with self-report measures comparable to or exceeding established benchmarks (for example, self-other agreement, ecological momentary assessment, and bespoke machine learning models). Although results varied across LLMs, we found that averaging the scores across models provided the strongest agreement with self-report. Further, LLM-generated trait scores also demonstrated predictive validity for daily behaviours and mental health outcomes. This LLM-based approach brings quantitative rigour to qualitative data and is easily accessible without specialized training. Importantly, our findings also reaffirm that personality is expressed ubiquitously: it is carried in the stream of our thoughts and woven into the fabric of our daily lives. These results encourage broader adoption of generative LLMs for psychological assessment and, given the new generation of tools, underscore the value of idiographic narratives as reliable sources of psychological insight.