Abstract
PURPOSE: To accurately record the movements of a hand-held target together with the smooth pursuit eye movements (SPEMs) that it elicits, using video-oculography (VOG) combined with deep learning-based object detection by a single-shot multibox detector (SSD). METHODS: The SPEMs of 11 healthy volunteers (aged 21.3 ± 0.9 years) were recorded using VOG. The subjects fixated on a target that the examiner moved manually at a distance of 1 m. An automatic recording system was developed using SSD, which predicts the type and location of objects in a single image. The 400 images of one subject taken with the VOG scene camera were divided into 2 groups (300 and 100) for training and validation, respectively. The testing data comprised 1100 images from all subjects (100 images/subject). Detection performance was evaluated as average precision at an intersection-over-union threshold of 0.75 (AP75), and the relationship between the location of the fixated target (as calculated by SSD) and the position of each eye (as recorded by VOG) was analyzed. RESULTS: The AP75 for all subjects was 99.7% ± 0.6%. The horizontal and vertical target locations were significantly and positively correlated with each eye position in the horizontal and vertical directions (adjusted R² ≥ 0.955, P < 0.001). CONCLUSIONS: Adding SSD-driven recording of hand-held target positions to VOG allows quantitative assessment of SPEMs made while following a target during an SPEM test. TRANSLATIONAL RELEVANCE: The combination of VOG and SSD can be used to detect SPEMs with greater accuracy, which can improve the outcome of clinical evaluations.
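For readers unfamiliar with the metric, AP75 conventionally denotes average precision computed with an intersection-over-union (IoU) threshold of 0.75: a predicted bounding box counts as a correct detection only if its IoU with the annotated box is at least 0.75. The following is a minimal sketch of that matching criterion, not the study's evaluation code. The box format (x_min, y_min, x_max, y_max) in pixels, the function names, and the example coordinates are assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical box format (an assumption): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection-over-union of two axis-aligned bounding boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(predicted: Box, annotated: Box, threshold: float = 0.75) -> bool:
    """True if the prediction counts as correct at the given IoU threshold."""
    return iou(predicted, annotated) >= threshold


# Illustrative coordinates only; these are not data from the study.
annotated_target = (100.0, 100.0, 150.0, 150.0)
predicted_target = (104.0, 102.0, 152.0, 151.0)
print(is_match(predicted_target, annotated_target))
```

Average precision itself is then obtained from the precision-recall curve built over all detections ranked by confidence, which this sketch does not cover.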