Abstract
Modern civil engineering structures, such as footbridges, are increasingly susceptible to vibrations induced by human activities, emphasizing the importance of accurately assessing crowd-induced loading. Developing realistic load models requires detailed insight into the underlying crowd dynamics, which in turn depend on the coordination between individuals and the spatial organization of the group. A deeper understanding of these human-human interactions is therefore essential for capturing the collective behaviour that governs crowd-induced vibrations. This paper presents a vision-based trajectory reconstruction methodology that captures individual movement trajectories in both small groups and large-scale running events. The approach integrates colour-based image segmentation for instrumented participants, deep learning-based object detection for uninstrumented crowds, and a homography-based projection that maps image coordinates to world space. The methodology is applied to empirical data from two urban running events and from controlled experiments, covering both stationary and dynamic camera perspectives. Results show that the framework reliably reconstructs individual trajectories under varied field conditions for both walking and running activities. The approach enables scalable monitoring of human activities and provides high-resolution spatio-temporal data for studying human-human interactions and modelling crowd dynamics. The findings thus highlight the potential of vision-based methods as practical, non-intrusive tools for analysing human-induced loading in both research and applied engineering contexts.