Evaluation of large language models in medical examinations: A scoping review protocol


Abstract

INTRODUCTION: Standardized medical examinations, used to assess trainees' clinical competencies, provide a rigorous means of verifying the accuracy and reliability of large language models (LLMs) in medical contexts. Although current evaluations use these exams to test LLMs' clinical reasoning, performance varies substantially across clinical scenarios, and existing evaluation methods struggle to adapt to evolving research needs. This study synthesizes prior research on LLMs in medical examinations, highlights current limitations, and proposes future research directions. METHODS AND ANALYSIS: The protocol was developed in accordance with the standards set forth in the JBI Manual for Evidence Synthesis. After establishing precise inclusion/exclusion criteria and search strategies, we will conduct systematic searches of the PubMed and Web of Science Core Collection databases. The method encompasses literature screening, data extraction, analytical frameworks, and process mapping, enabling researchers to maintain methodological rigor throughout the research process. ETHICS AND DISSEMINATION: This protocol describes a method for performing a scoping review, which focuses on the organized synthesis and examination of previously published research. It involves no human or animal experimentation and no collection of sensitive data; ethical approval is therefore not required for this literature-based study.
