Unsupervised Optical-Sensor Extrinsic Calibration via Dual-Transformer Alignment

基于双变压器对准的无监督光学传感器外参校准

阅读:1

Abstract

Accurate extrinsic calibration between optical sensors, such as camera and LiDAR, is crucial for multimodal perception. Traditional methods based on specific calibration targets exhibit poor robustness in complex optical environments such as glare, reflections, or low light, and they rely on cumbersome manual operations. To address this, we propose a fully unsupervised, end-to-end calibration framework. Our approach adopts a dual-Transformer architecture: a Vision Transformer extracts semantic features from the image stream, while a Point Transformer captures the geometric structure of the 3D LiDAR point cloud. These cross-modal representations are aligned and fused through a neural network, and a regression algorithm is used to obtain the 6-DoF extrinsic transformation matrix. A multi-constraint loss function is designed to enhance structural consistency between modalities, thereby improving calibration stability and accuracy. On the KITTI benchmark, our method achieves a mean rotation error of 0.21° and a translation error of 3.31 cm; on a self-collected dataset, it attains an average reprojection error of 1.52 pixels. These results demonstrate a generalizable and robust solution for optical-sensor extrinsic calibration, enabling precise and self-sufficient perception in real-world applications.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。