Abstract
In this study, we propose EnhanceCenter, a multiple-object tracking model that improves tracking efficiency and stability while reducing dependence on computationally intensive detectors. EnhanceCenter builds on the CenterTrack method and introduces three key improvements. First, a channel-spatial-spatial feature fusion module makes effective use of object appearance information, enhancing tracking in complex scenes. Second, the backbone network weights are optimized for multiple-object tracking tasks, enabling more effective feature extraction. Third, an improved association method increases long-term tracking stability, maintaining consistency during occlusions or detection failures. Experiments on various MOT benchmarks compared EnhanceCenter against models that use high-performance detectors. On the MOT17 test set, EnhanceCenter outperformed CenterTrack by 1.6% in IDF1 and achieved a HOTA of 55.1%, surpassing leading center-point-based trackers such as TransTrack and TransCenter. On the MOT20 dataset, it improved IDF1 by 13% over CenterTrack. This research underscores the potential of lightweight detectors to achieve state-of-the-art multiple-object tracking performance, paving the way for more efficient tracking solutions in complex environments.