Kalman Filter-Based Fusion of LiDAR and Camera Data in Bird's Eye View for Multi-Object Tracking in Autonomous Vehicles


Abstract

Accurate multi-object tracking (MOT) is essential for autonomous vehicles, enabling them to perceive and interact with dynamic environments effectively. Single-modality 3D MOT algorithms often face limitations due to sensor constraints, resulting in unreliable tracking. Recent multi-modal approaches have improved performance but rely heavily on complex, deep-learning-based fusion techniques. In this work, we present CLF-BEVSORT, a camera-LiDAR fusion model operating in the bird's eye view (BEV) space using the SORT tracking framework. The proposed method introduces a novel association strategy that incorporates structural similarity into the cost function, enabling effective fusion of 2D camera detections and 3D LiDAR detections, and leveraging LiDAR depth for robust track recovery during short occlusions. Evaluated on the KITTI dataset, CLF-BEVSORT achieves state-of-the-art performance with a HOTA score of 77.26% for the Car class, surpassing StrongFusionMOT and DeepFusionMOT by 2.13%, with high precision (85.13%) and recall (80.45%). For the Pedestrian class, it achieves a HOTA score of 46.03%, outperforming Be-Track and StrongFusionMOT by 6.16%. Additionally, CLF-BEVSORT reduces identity switches (IDSW) by over 45% for cars compared to the baselines AB3DMOT and BEVSORT, demonstrating robust, consistent tracking and setting a new benchmark for 3D MOT in autonomous driving.
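To make the tracking pipeline concrete, the sketch below shows a minimal SORT-style loop in BEV space: a constant-velocity Kalman filter per track and a nearest-neighbour association step. This is an illustrative simplification, not the paper's implementation; the state layout, noise covariances, and the greedy distance-based matcher (standing in for CLF-BEVSORT's structural-similarity cost and its 2D/3D fusion) are all assumptions made for the example.

```python
import numpy as np


class BEVKalmanTrack:
    """Constant-velocity Kalman filter over a BEV state [x, y, vx, vy].

    A simplified sketch: the actual state design and noise settings in
    CLF-BEVSORT may differ.
    """

    def __init__(self, x, y, dt=0.1):
        self.x = np.array([x, y, 0.0, 0.0])            # state mean
        self.P = np.diag([1.0, 1.0, 10.0, 10.0])       # state covariance
        self.F = np.eye(4)                             # constant-velocity model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                          # observe position only
        self.Q = 0.01 * np.eye(4)                      # process noise (assumed)
        self.R = 0.1 * np.eye(2)                       # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                              # predicted BEV position

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x     # innovation
        S = self.H @ self.P @ self.H.T + self.R        # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P


def greedy_associate(tracks, detections, gate=2.0):
    """Greedy nearest-neighbour matching on BEV distance.

    A stand-in for the paper's structural-similarity cost function;
    returns (track_index, detection_index) pairs within the gate.
    """
    if not tracks or not detections:
        return []
    preds = [t.predict() for t in tracks]
    cost = np.array([[np.linalg.norm(p - d) for d in detections] for p in preds])
    matches, used = [], set()
    for ti in np.argsort(cost.min(axis=1)):            # best tracks first
        di = int(np.argmin(cost[ti]))
        if di not in used and cost[ti, di] < gate:
            matches.append((int(ti), di))
            used.add(di)
    return matches
```

In a full tracker, unmatched detections would spawn new tracks and unmatched tracks would coast on their predictions, which is how Kalman prediction bridges short occlusions before a detection reappears and re-associates.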
