Abstract
BACKGROUND: Endotracheal intubation (ETI) is an emergency procedure performed in civilian and combat casualty care settings to establish an airway. Healthcare personnel must be proficient in this skill, which has traditionally been evaluated through direct feedback from experts. This method can be inconsistent and subjective, and it requires considerable time and resources. METHODS: This study introduces a system for assessing ETI skills using video analysis. The system uses a self-supervised 2D convolutional autoencoder (AE) capable of recognizing complex patterns in videos. A 1D convolutional model enhanced with a cross-view attention module then uses the AE features to make assessments. Data were gathered in two phases: the first compared experts with novices, and the second examined how novices perform under time constraints, with outcomes labeled as either successful or unsuccessful. A separate dataset of videos from head-mounted cameras was also analyzed. RESULTS: The system distinguishes between experts and novices in the initial trials and demonstrates high accuracy in the further classifications, including under time pressure and using head-mounted camera footage. CONCLUSIONS: The system's ability to accurately differentiate between experts and novices supports its effectiveness and its potential to improve training and certification processes for healthcare providers.
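
The abstract does not specify how the cross-view attention module is built. As a rough illustration only, the sketch below shows one common form of cross-attention, in which each time step of one camera view's feature sequence attends over all time steps of a second view. The function name `cross_view_attention`, the absence of learned projections, and the scaled dot-product scoring are assumptions made for this example and are not taken from the study.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(view_a, view_b):
    """Hypothetical cross-view attention (illustrative, not the authors' module).

    view_a: list of T_a feature vectors (queries), each of length d
    view_b: list of T_b feature vectors (keys and values), each of length d
    Returns T_a fused vectors of length d, each a weighted average of
    view_b's vectors. No learned projections are used (an assumption).
    """
    d = len(view_a[0])
    scale = math.sqrt(d)
    fused = []
    for q in view_a:
        # Scaled dot-product score between this query and every key in view_b.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in view_b]
        weights = softmax(scores)
        # Weighted sum of view_b's vectors, one output dimension at a time.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, view_b)) for j in range(d)]
        )
    return fused
```

In a pipeline like the one described, the inputs would be per-frame AE feature vectors from two views, and the fused sequence would be passed to the 1D convolutional classifier. A trained module would normally also include learned query, key, and value projections.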