Abstract
OBJECTIVE: To evaluate the adherence of large language model (LLM)-based healthcare research to the Minimum Reporting Items for Clear Evaluation of Accuracy Reports of Large Language Models in Healthcare (MI-CLEAR-LLM) checklist, a framework designed to enhance the transparency and reproducibility of studies on the accuracy of LLMs for medical applications. MATERIALS AND METHODS: A systematic PubMed search was conducted to identify articles on LLM performance published in high-ranking clinical medicine journals (the top 10% in each of the 59 specialties according to the 2023 Journal Impact Factor) from November 30, 2022, through June 25, 2024. Data on the six MI-CLEAR-LLM checklist items: 1) identification and specification of the LLM used, 2) stochasticity handling, 3) prompt wording and syntax, 4) prompt structuring, 5) prompt testing and optimization, and 6) independence of the test data-were independently extracted by two reviewers, and adherence was calculated for each item. RESULTS: Of 159 studies, 100% (159/159) reported the name of the LLM, 96.9% (154/159) reported the version, and 91.8% (146/159) reported the manufacturer. However, only 54.1% (86/159) reported the training data cutoff date, 6.3% (10/159) documented access to web-based information, and 50.9% (81/159) provided the date of the query attempts. Clear documentation regarding stochasticity management was provided in 15.1% (24/159) of the studies. Regarding prompt details, 49.1% (78/159) provided exact prompt wording and syntax but only 34.0% (54/159) documented prompt-structuring practices. While 46.5% (74/159) of the studies detailed prompt testing, only 15.7% (25/159) explained the rationale for specific word choices. Test data independence was reported for only 13.2% (21/159) of the studies, and 56.6% (43/76) provided URLs for internet-sourced test data. CONCLUSION: Although basic LLM identification details were relatively well reported, other key aspects, including stochasticity, prompts, and test data, were frequently underreported. Enhancing adherence to the MI-CLEAR-LLM checklist will allow LLM research to achieve greater transparency and will foster more credible and reliable future studies.