Abstract
BACKGROUND: Competency-based medical education requires structured training and valid methods for assessing technical proficiency. Virtual reality simulators offer objective, reproducible assessments but often rely on developer-defined metrics lacking transparency and robust validity evidence. This study aimed to collect validity evidence, guided by Messick’s unified validity framework, for a novel, expert-informed, immersive virtual reality simulator designed to assess proficiency among orthopedic physicians for volar plate fixation of distal radius fractures.
METHODS: Twelve orthopedic residents (novices) and eleven experienced orthopedic surgeons completed one distal radius fracture fixation in the simulator. Performance was evaluated using 32 simulator metrics addressing technical proficiency, imaging accuracy, and procedural errors. Simulator metrics were compared between the two groups individually (secondary outcomes) and combined into a composite total simulator score (primary outcome). Validity evidence was structured according to Messick’s framework. The composite score’s discriminatory ability was assessed using the Contrasting Groups’ Method.
RESULTS: Experienced surgeons significantly outperformed novices on nine of 32 simulator metrics, while the novices scored higher on one. After excluding this construct-irrelevant metric, the remaining 31 metrics demonstrated acceptable reliability (Cronbach’s alpha = 0.79). The total simulator score was significantly different for the two groups: novices scored −1.6 points (CI −12.3 to 9.2), and experienced surgeons scored 61.9 points (CI 53.6 to 70.3), p < .001. A discriminatory standard for the total simulator score of 34.0 points completely discriminated between novices and experienced surgeons.
CONCLUSION: This study provides comprehensive validity evidence – covering all five sources in Messick’s framework – for a newly developed, expert-informed, immersive virtual reality simulator for distal radius fracture fixation. The simulator-based assessment reliably discriminated between novices and experienced orthopedic surgeons. These findings support the simulator’s utility for objective skills assessment in distal radius fracture fixation within orthopedic residency training.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12909-026-08983-5.
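ILLUSTRATIVE SKETCH: The abstract does not state how the discriminatory standard was computed. One common reading of the contrasting groups method is to fit a normal curve to each group's scores and place the standard where the two curves intersect between the group means. The Python sketch below follows that reading as an assumption; the function name `contrasting_groups_cutoff` and any scores passed to it are hypothetical and are not the study's code or data.

```python
import math
from statistics import mean, stdev


def contrasting_groups_cutoff(novice_scores, expert_scores):
    """Score at which normal curves fitted to the two groups intersect.

    Assumption: each group's scores are modeled as a normal distribution
    using the sample mean and sample standard deviation.
    """
    m1, s1 = mean(novice_scores), stdev(novice_scores)
    m2, s2 = mean(expert_scores), stdev(expert_scores)

    # Setting the two normal densities equal and taking logs gives
    # a*x^2 + b*x + c = 0 with the coefficients below.
    a = 1 / (2 * s1**2) - 1 / (2 * s2**2)
    b = m2 / s2**2 - m1 / s1**2
    c = m1**2 / (2 * s1**2) - m2**2 / (2 * s2**2) + math.log(s1 / s2)

    if math.isclose(a, 0.0, abs_tol=1e-12):
        # Equal spreads: the equation is linear and has a single root.
        roots = [-c / b]
    else:
        disc = math.sqrt(b**2 - 4 * a * c)
        roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]

    # Keep only the intersection that lies between the two group means.
    low, high = sorted((m1, m2))
    between = [r for r in roots if low <= r <= high]
    if not between:
        raise ValueError("fitted curves do not intersect between the group means")
    return between[0]
```

When both groups have the same spread, the function returns the midpoint of the two means; with unequal spreads, the cutoff shifts toward the mean of the group with the narrower distribution. Scores at or above the returned value would be classified as passing under this sketch.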