Abstract
BACKGROUND: This study aimed to evaluate the differences in test-based clinical knowledge between resident physicians who had received their undergraduate medical education in Japanese and foreign medical schools using the nationwide General Medicine In-Training Examination (GM-ITE(®)) scores and questionnaires. METHOD: We conducted a nationwide cross-sectional study of 9,106 resident physicians from 669 medical institutions in Japan who participated in the GM-ITE(®) from January 17 to 30, 2024. The GM-ITE(®) provides a highly reliable evaluation of resident physicians’ test-based clinical knowledge. The 2023 GM-ITE(®) included 80 multiple-choice questions in four categories (Medical Interview and Professionalism, Symptomatology and Clinical Reasoning, Physical Examination and Clinical Skills, and Disease-Specific Topics), and six fields (Internal Medicine, Surgery, Pediatrics, Obstetrics and Gynecology, Psychiatry, and Emergency Medicine), with a maximum score of 80. Ten of the 80 questions were in English. We conducted a regional analysis (Japan, Non-Japan Asia, and Europe and Other) according to the location of the medical school where resident physicians received their undergraduate medical education. RESULT: The mean (standard deviation) GM-ITE(®) scores were 43.2 (6.9) in the Japan group, 40.3 (4.9) in the Non-Japan Asia group, and 43.5 (6.4) in the Europe and Other group, and no statistically significant differences were observed by region (p = 0.153). The scores of resident physicians in the three regional groups did not differ significantly in the Medical Interview and Professionalism, Symptomatology and Clinical Reasoning, and Physical Examination and Clinical Skills categories, but those in the Non-Japan Asia group scored lower in the Disease-Specific Topics category (p = 0.003).
The scores of the three regional groups did not differ significantly in the Internal Medicine (p = 0.637), Pediatrics (p = 0.296), Psychiatry (p = 0.112), and Emergency Medicine (p = 0.115) fields, but the Japan group scored higher in the Surgery (p = 0.007) and Obstetrics and Gynecology (p = 0.002) fields. The Europe and Other group scored significantly higher on the questions asked in English (p = 0.010). CONCLUSION: Overall GM-ITE(®) scores showed no clear differences between groups; however, interpretation is limited by the small sample of international medical graduates (IMGs), and category- or field-specific findings are exploratory. Observed differences warrant further study of educational and training background. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12909-026-08941-1.