Abstract
BACKGROUND: Artificial intelligence (AI) is increasingly applied in medical education, but its role in fostering interactive clinical competencies remains underexplored. This pilot study aimed to compare the feasibility and educational impact of an AI chatbot-based simulation with traditional peer role-play (PRP) for Objective Structured Clinical Examination (OSCE) preparation, and to share practical lessons from implementing a novel AI tool in a trial setting. METHODS: Nineteen final-year Korean medicine students were randomly assigned to either an AI chatbot group (n = 9) or a PRP group (n = 10) after a baseline knowledge test. Both groups completed a 30-minute physical examination practice session, followed by a one-hour clinical interview training session specific to their group. The AI chatbot group practiced with a GPT-4o/Claude 3.5-based chatbot that provided scenario-driven responses and automated feedback, while the PRP group practiced in pairs under tutor supervision. All participants then completed two OSCE stations (dizziness and shoulder pain). Performance was assessed using a structured checklist covering four domains: history taking, physical examination, patient education, and physician-patient interaction. Post-study questionnaires evaluated the learning experience. RESULTS: Although the between-group differences in OSCE scores did not reach statistical significance, several complementary trends were observed. The PRP group tended to score higher in history taking (dizziness: mean 74.4 vs. 66.2, Hedges' g = -0.68; shoulder pain: mean 58.6 vs. 54.5, Hedges' g = -0.21), while the AI chatbot group tended to score higher in patient education (dizziness: 32.5 vs. 22.2, Hedges' g = 0.44; shoulder pain: 85.0 vs. 66.7, Hedges' g = 0.99). Survey results reflected these trends.
The PRP group valued the authenticity of the interaction and the exam-like environment, whereas the AI chatbot group reported higher satisfaction with the autonomy, opportunities for repeated practice, and structured feedback. CONCLUSION: In this pilot study, AI chatbot-based training and PRP demonstrated complementary strengths for OSCE preparation. While PRP appears effective for developing performance-based procedural and communication skills in a realistic setting, AI chatbots show potential for fostering clinical reasoning in a self-paced, reflective learning environment. These findings suggest that a blended learning model combining both methods may be optimal for holistic clinical skills development. Further research is needed to validate these preliminary results.