Abstract
INTRODUCTION: Standardized medical examinations, which are used to assess the clinical competencies of trainees, offer a rigorous way to verify the accuracy and reliability of large language models (LLMs) in medical contexts. Although current evaluations use these exams to test LLMs' clinical reasoning, performance varies significantly across clinical scenarios, and existing evaluation methods struggle to adapt to evolving research needs. This scoping review will synthesize prior research on LLMs in medical exams, highlight current limitations, and propose directions for future research.
METHODS AND ANALYSIS: The protocol was developed in accordance with the JBI Manual for Evidence Synthesis. After establishing precise inclusion and exclusion criteria and search strategies, we will conduct systematic searches of the PubMed and Web of Science Core Collection databases. The approach covers literature review, data extraction, analytical frameworks, and process mapping, and is intended to maintain methodological rigor throughout the review.
ETHICS AND DISSEMINATION: This protocol describes a scoping review that synthesizes and examines previously published research. It involves no human or animal experimentation and no collection of sensitive data, so ethical approval is not required.