Abstract
OBJECTIVE: To compare the reliability and readability of responses from Generative Pre-trained Transformer versions 3.5 (GPT-3.5) and 4.0 (GPT-4.0) on traumatic brain injury (TBI) topics against Model Systems Knowledge Translation Center (MSKTC) fact sheets.

METHODS: Responses from GPT-3.5 and GPT-4.0 on TBI topics were evaluated for accuracy, comprehensiveness, and readability against MSKTC fact sheets, and a correlation analysis between reliability and readability scores was performed.

RESULTS: Reliability improved from GPT-3.5 (mean score = 3.21) to GPT-4.0 (mean score = 3.63), indicating greater accuracy and completeness in the newer version. Despite this improvement, responses were generally accurate but not fully comprehensive. The MSKTC fact sheets were significantly more readable than the responses from either artificial intelligence (AI) version, and no strong correlation was found between reliability and readability.

CONCLUSION: AI-generated information on TBI improved in reliability from GPT-3.5 to GPT-4.0. However, both versions fall short of the readability of standard patient education materials, underscoring the need for future AI development to improve understandability alongside accuracy.