Abstract
Recent advances in computer vision have enabled the development of automated tools for observing animal behavior. Several software packages currently exist for concurrently tracking pose in multiple animals; however, these tools still struggle to maintain animal identities across frames and can demand extensive human oversight and editing. Here we present DIPLOMAT, a Deep learning-based, Identity-Preserving, Labeled-Object Multi-Animal Tracker, which implements automated algorithms to improve identity continuity, supplemented by an efficient human interface to help eliminate remaining errors. DIPLOMAT performs multi-animal tracking by building on the per-frame pose prediction models of two state-of-the-art tools, DeepLabCut and SLEAP, and applies novel methods to tolerate occlusion and preserve animal identity across frames. Notable features include the use of model-derived positional probabilities to compute independent maximum-probability traces across the frames of a video, the use of video-specific skeletal constraints, and an efficient user interface for resolving errors. On the MABe mouse tracking benchmark, automated tracking with DIPLOMAT reduces body identity swaps by more than 75%, and the remaining errors can be readily removed by manual correction.
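To illustrate what a maximum-probability trace across frames can look like, the following is a minimal Viterbi-style sketch for a single body part. It is not DIPLOMAT's implementation: the function name `max_probability_trace`, the isotropic Gaussian transition penalty, the `sigma` parameter, and the toy input data are all assumptions made for illustration, and skeletal constraints and multiple animals are not modeled.

```python
import math


def max_probability_trace(frame_probs, coords, sigma=1.0):
    """Return the most probable sequence of location indices over all frames.

    frame_probs: one list per frame, giving a probability for each candidate
                 location (hypothetical stand-in for model output).
    coords:      (x, y) position of each candidate location.
    sigma:       assumed spread of frame-to-frame movement (illustrative).
    """
    n = len(coords)
    eps = 1e-12  # avoids log(0) for locations with zero probability

    def log_transition(i, j):
        # Assumed Gaussian penalty on the squared distance moved between frames.
        dx = coords[i][0] - coords[j][0]
        dy = coords[i][1] - coords[j][1]
        return -(dx * dx + dy * dy) / (2.0 * sigma * sigma)

    # Forward pass: best log-score of any path ending at each location.
    score = [math.log(p + eps) for p in frame_probs[0]]
    backpointers = []
    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_transition(i, j))
            pointers.append(best_i)
            new_score.append(
                score[best_i] + log_transition(best_i, j) + math.log(probs[j] + eps)
            )
        backpointers.append(pointers)
        score = new_score

    # Backward pass: follow the stored pointers from the best final location.
    last = max(range(n), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Toy example: three candidate locations, one of them far from the others.
coords = [(0, 0), (1, 0), (10, 0)]
frame_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.20, 0.70],  # the per-frame maximum jumps to the distant location
    [0.05, 0.90, 0.05],
]
trace = max_probability_trace(frame_probs, coords, sigma=1.0)
print(trace)  # [0, 1, 1]
```

In this toy input, picking the most probable location in each frame independently would give the sequence 0, 2, 1, which implies an implausible jump to the distant location in the second frame. Scoring the whole trace jointly keeps the path on nearby locations instead, which is the general intuition behind using positional probabilities across frames rather than per-frame maxima.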