Abstract
The rise of AI has seen an explosion in the use of deep learning methods that automate the analysis of image and video data, saving ecologists vast amounts of time and resources. Ecological imagery, however, poses unique challenges: cryptic species can be difficult to detect amid poor visibility and diverse environments. We propose leveraging movement information to improve the predictions produced by a high-performing object detection algorithm. Frame differencing, background subtraction, optical flow and multi-object tracking are trialled on four diverse datasets containing over 35,000 annotated images sourced from terrestrial, marine and freshwater habitats. We find that leveraging movement information is useful for smaller studies and rarer species, but is not needed for well-annotated studies (>400 annotations per class). Of the methods that utilise movement, a simple 'differencing' of neighbouring frames generally performed best, whilst attempting to track taxa to boost prediction scores performed poorly. Other studies in this area tend to focus on only one or two datasets and a single movement-based method, making it difficult for ecologists to generalise their results. Our study provides key lessons for ecologists deciding whether to incorporate methods that leverage movement information when attempting to automatically predict taxa. We offer straightforward code for practical implementation via our GitHub repository, BenMaslen/MCD, along with an evaluation benchmark dataset called 'Tassie BRUV' that can be accessed from the Dryad public repository https://doi.org/10.5061/dryad.sbcc2frf7.
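To make the best-performing movement method concrete, below is a minimal sketch of frame differencing in NumPy: moving animals produce large intensity changes between consecutive frames, while a static background mostly cancels out. The function name, the toy frames and the threshold value are illustrative assumptions, not the authors' implementation (see the BenMaslen/MCD repository for that).

```python
import numpy as np

def motion_mask(prev_frame: np.ndarray, curr_frame: np.ndarray,
                threshold: int = 25) -> np.ndarray:
    """Binary mask of pixels whose intensity changed by more than
    `threshold` between two consecutive greyscale frames.

    Note: a hypothetical helper for illustration only; cast to a signed
    dtype first so the subtraction cannot wrap around in uint8.
    """
    diff = np.abs(prev_frame.astype(np.int16) - curr_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Toy example: a bright 2x2 "animal" moves one pixel to the right.
prev = np.zeros((6, 6), dtype=np.uint8)
curr = np.zeros((6, 6), dtype=np.uint8)
prev[2:4, 1:3] = 200
curr[2:4, 2:4] = 200

mask = motion_mask(prev, curr)
# The mask highlights the trailing edge the animal vacated (column 1)
# and the leading edge it entered (column 3); the overlap is unchanged.
```

In practice the mask (or the raw difference image) can be supplied to a detector as an extra input channel or used to filter candidate regions, which is the general idea the methods compared in this study build on.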