Abstract
INTRODUCTION: Systematic video analysis of events is a widely applied method for examining the mechanisms and underlying causes of sports injuries. Yet the use of multiple raters poses a considerable challenge, as achieving high inter-rater reliability in video-based assessments is inherently difficult. This study evaluated the inter- and intra-rater reliability of video analysis for identifying events leading to potential injuries in winter sports, focusing on snowboard cross (SBX) and ski cross (SX).

METHOD: Four raters reviewed the video footage. In total, 644 situations were analysed, categorized by parameters such as crash type, course trajectory, and competitor behaviour. A standardized process was established for training the raters to classify defined situations as Crash (CR), Time of No Return (TNR), Rank Shift (RS), Out of Balance (OOB), Contact (CT), or Avoided Contact (ACT). Inter-rater reliability was assessed using Fleiss' Kappa and Cronbach's Alpha, while Cohen's Kappa was used to evaluate intra-rater reliability.

RESULTS: Categories with distinct, easily identifiable outcomes, such as TNR and CR, exhibited high inter-rater reliability, and the qualitative interpretation of the inter-rater reliability values differed only marginally between Cronbach's Alpha and Fleiss' Kappa. Categories requiring more nuanced interpretation, such as out-of-balance situations and athlete contact, showed moderate reliability, whereas categories such as avoided contact showed lower reliability values. Intra-rater reliability ranged from fair to moderate across all raters. Clearly identifiable events such as CR and TNR were identified with perfect agreement, while the other categories showed a more ambiguous pattern.

CONCLUSION: This study advances the field of sports analysis by proposing a standardized methodology for video analysis in sports with high injury incidence, specifically SBX and SX. Categories with very clear definitions were identified with high inter-rater reliability (CR, TNR), others were classified with moderate accuracy across raters (RS, OOB, CT), and some could not be reliably distinguished (ACT), even following structured training. The same pattern was observed for intra-rater reliability. This method allows a higher volume of cases to be reliably analysed, which could inform more robust injury prevention strategies.
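For reference, Cohen's Kappa (two ratings, used here for intra-rater reliability) and Fleiss' Kappa (its generalization to more than two raters) are both chance-corrected agreement statistics of the form

$$\kappa = \frac{p_o - p_e}{1 - p_e},$$

where $p_o$ is the observed proportion of agreement and $p_e$ the agreement expected by chance. Cronbach's Alpha, by contrast, is an internal-consistency coefficient, which is why the abstract notes that the two inter-rater measures can differ slightly in their qualitative interpretation.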
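The following is a minimal sketch, not the study's actual analysis code, of how the three reported statistics could be computed in Python for rating data of this shape, using statsmodels and scikit-learn. The rating matrix, the integer coding of the six categories, and the simulated re-rating pass are all hypothetical placeholders.

```python
# Minimal sketch (hypothetical data, not the study's code): computing
# Fleiss' kappa, Cronbach's alpha, and Cohen's kappa for 644 situations
# rated by 4 raters, with categories CR/TNR/RS/OOB/CT/ACT coded as 0..5.

import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)

# Hypothetical matrix: one row per reviewed situation, one column per rater.
n_situations, n_raters = 644, 4
ratings = rng.integers(0, 6, size=(n_situations, n_raters))

# Inter-rater reliability across all four raters (Fleiss' kappa).
counts, _ = aggregate_raters(ratings)        # situations x categories count table
kappa_fleiss = fleiss_kappa(counts, method="fleiss")

# Cronbach's alpha, treating raters as "items" and the category codes as
# numeric scores (mirroring the abstract's use of alpha on these ratings).
item_vars = ratings.var(axis=0, ddof=1)      # variance of each rater's scores
total_var = ratings.sum(axis=1).var(ddof=1)  # variance of per-situation sums
alpha = n_raters / (n_raters - 1) * (1 - item_vars.sum() / total_var)

# Intra-rater reliability (Cohen's kappa): one rater's first pass vs. a
# simulated re-rating pass that disagrees on roughly 10% of situations.
first_pass = ratings[:, 0]
second_pass = first_pass.copy()
flip = rng.random(n_situations) < 0.1
second_pass[flip] = rng.integers(0, 6, size=flip.sum())
kappa_cohen = cohen_kappa_score(first_pass, second_pass)

print(f"Fleiss' kappa:    {kappa_fleiss:.3f}")
print(f"Cronbach's alpha: {alpha:.3f}")
print(f"Cohen's kappa:    {kappa_cohen:.3f}")
```

On purely random ratings such as these, both kappa values should sit near zero, which serves as a sanity check that the chance correction is working; real rating data with the agreement levels described above would yield substantially higher values for categories like CR and TNR.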