Abstract
Introduction
Artificial intelligence (AI) chatbots are increasingly being used to create patient education guides (PEGs). However, gaps remain in the literature comparing the latest versions of these chatbots in terms of readability, reliability, and similarity. The aim of this study was to compare PEGs generated by ChatGPT 5.1 (OpenAI, San Francisco, CA, USA) and Gemini 3 Pro (Google LLC, Mountain View, CA, USA) for five common urological conditions (kidney stones, urinary tract infection, urinary retention, erectile dysfunction, and benign prostatic hyperplasia) across these three domains.
Methods
This cross-sectional study analysed PEGs generated by both AI chatbots for the five conditions using identical prompts. Readability was assessed using the Flesch Reading Ease Score and the Flesch-Kincaid Grade Level. Reliability and similarity were assessed using a modified DISCERN score and Turnitin, respectively. Statistical comparison was performed using the Mann-Whitney U test.
Results
None of the evaluated characteristics differed significantly between the PEGs generated by the two AI chatbots.
Conclusion
PEGs generated by both AI chatbots exceeded the recommended reading level, demonstrated limited originality, and showed moderate reliability, highlighting the need for professional oversight. Continued refinement of AI chatbots is necessary before AI-generated PEGs can be integrated into routine patient education.
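As general background (these formulas are standard definitions, not reproduced from the study itself), the two readability indices named in the Methods are conventionally computed as:

\[
\text{FRES} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
\]

\[
\text{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59
\]

Higher FRES values indicate easier text, while FKGL approximates the US school grade level required to comprehend the text.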