Abstract
INTRODUCTION: Systematic reviews (SRs) require comprehensive, reproducible searches, yet developing search strategies is resource-intensive and demands specialized expertise. Generative AI (GAI) offers the potential to streamline this process, but empirical evaluations of GAI-assisted SR searching remain scarce. The objectives of this study were to demonstrate a step-by-step process for developing a custom ChatGPT-based chatbot to support SR search strategy development and to evaluate its performance.
DESIGN: A cross-sectional evaluation study.
METHODS: We used ChatGPT-4.0 to create a chatbot designed to mimic a medical librarian, generating PICO-informed searches. Its knowledge base was augmented with two methodological references. After pilot testing, we refined its instructions. For evaluation, we randomly sampled 50 Cochrane SRs published in 2024. Standardized P-I-O prompts produced database-ready queries for PubMed and Embase. The primary outcome was the per-review retrieval rate (the proportion of each review's included studies identified by the chatbot's searches), summarized by median and interquartile range (IQR). A sensitivity analysis was restricted to indexed studies.
RESULTS: Pilot testing achieved a retrieval rate of 41/49 (83.7%). In the main sample (1169 studies; median 13.5 studies per SR), the chatbot identified a median of 67.4% of included studies (IQR: 43.1%-88.4%). When limited to indexed studies (n = 1114), retrieval rose to 72.0% (IQR: 46.0%-92.5%). Lower performance was observed when outcomes were absent from the abstracts or interventions had many lexical variants.
CONCLUSIONS: A GAI-based chatbot can rapidly generate SR searches that identify a median of roughly 67%-72% of included studies, serving as a useful starting point but not a replacement for expert-led approaches. Integration of librarian expertise, structured prompts, and controlled vocabularies may improve performance. Further benchmarking and transparent reporting are needed to guide adoption.
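The primary outcome described above can be illustrated with a minimal Python sketch. It computes, for each review, the share of included studies that a search retrieved, then summarizes those shares by median and IQR. The function names, the toy study identifiers, and the choice of the "inclusive" quartile method are assumptions made for illustration; the abstract does not state how quartiles were computed, and the data below are not the study's data.

```python
from statistics import median, quantiles


def retrieval_rate(retrieved_ids, included_ids, indexed_ids=None):
    """Fraction of a review's included studies that the search retrieved.

    If indexed_ids is given, the denominator is limited to included studies
    that are indexed (a sketch of the sensitivity analysis).
    """
    included = set(included_ids)
    if indexed_ids is not None:
        included &= set(indexed_ids)
    if not included:
        raise ValueError("no included studies to evaluate")
    return len(included & set(retrieved_ids)) / len(included)


def summarize(rates):
    """Return (median, Q1, Q3) of per-review retrieval rates.

    The 'inclusive' quartile method is an assumption, not the paper's choice.
    """
    q1, _, q3 = quantiles(rates, n=4, method="inclusive")
    return median(rates), q1, q3


# Hypothetical toy data: review -> (retrieved study IDs, included study IDs).
reviews = {
    "review_a": ({"s1", "s2", "s3"}, {"s1", "s2", "s3", "s4"}),
    "review_b": ({"s5"}, {"s5", "s6"}),
    "review_c": ({"s7", "s8"}, {"s7", "s8"}),
}

rates = [retrieval_rate(found, included) for found, included in reviews.values()]
med, q1, q3 = summarize(rates)
print(f"median={med:.1%}, IQR={q1:.1%}-{q3:.1%}")
```

Summarizing per review, rather than pooling all included studies into one fraction, keeps a single large review from dominating the result.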