Abstract
Vision-based simultaneous localization and mapping (SLAM) systems are essential for mobile robots to localize themselves and build spatial models of their surroundings. However, most visual SLAM systems assume a static environment, which causes significant accuracy degradation in dynamic scenes. We present SGDO-SLAM, a real-time RGB-D semantic-aware SLAM framework built upon ORB-SLAM2 for dynamic environments. First, a coarse-to-fine multi-constraint dynamic feature rejection method is proposed: coarse rejection combines semantic and geometric information, and fine rejection uses depth information, quantifying a static quality weight for each feature from depth consistency constraints. This method achieves accurate perception of dynamic regions and improves the system's localization accuracy. Second, a pose optimization method driven by the static quality weights is proposed, which prioritizes high-quality static features to enhance pose estimation. Finally, a dense point cloud map is built for visualization. We evaluated the system on the TUM RGB-D and Bonn datasets. The results show that SGDO-SLAM reduces absolute trajectory error by 95% compared with ORB-SLAM2, while maintaining real-time performance and achieving state-of-the-art accuracy in dynamic scenes.
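The abstract does not specify the exact form of the depth-consistency weighting; the following is a minimal illustrative sketch, assuming a Gaussian kernel on the residual between each feature's measured depth and the depth predicted by the current pose estimate (function name, `sigma`, and the example values are hypothetical, not from the paper):

```python
import numpy as np

def static_quality_weights(observed_depth, projected_depth, sigma=0.05):
    """Illustrative depth-consistency weighting (assumed form):
    features whose measured depth agrees with the depth predicted
    under the current camera pose receive weights near 1; features
    with large disagreement (likely dynamic) decay toward 0."""
    residual = np.abs(observed_depth - projected_depth)  # meters
    return np.exp(-(residual / sigma) ** 2)

# Usage: three features; the last one lies on a moving object,
# so its measured and predicted depths disagree strongly.
obs = np.array([1.00, 2.50, 1.80])
proj = np.array([1.01, 2.52, 2.60])
w = static_quality_weights(obs, proj)
```

Such weights could then scale the reprojection-error terms in pose optimization, so consistent (static) features dominate the estimate.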