Abstract
Accurate visual localization in complex indoor environments remains a significant challenge due to feature degradation and cumulative errors. To address these issues, we propose PLPM-SLAM, a novel RGB-D SLAM framework that integrates orthogonal Manhattan plane constraints with point-line-plane joint optimization to improve both robustness and accuracy. Unlike traditional approaches that decouple only the rotation matrix, PLPM-SLAM exploits three mutually orthogonal planes to jointly decouple rotation and translation, effectively mitigating global drift. For scenes lacking a complete Manhattan structure, we introduce a virtual plane construction strategy based on heterogeneous feature associations. In addition, PLPM-SLAM incorporates both homogeneous (point-point, line-line, plane-plane) and heterogeneous (point-line, point-plane, line-plane) geometric constraints throughout tracking and optimization. In unstructured environments, a vanishing-point-guided joint optimization model further improves geometric consistency. Extensive evaluations on public datasets (TUM, ICL-NUIM) and real-world sequences show that PLPM-SLAM consistently outperforms ORB-SLAM3 in both structured and low-texture settings, achieving RMSE reductions of up to 82.77% and 92.16% on the public and real-world datasets, respectively.