Abstract
Traditional visual SLAM systems, such as ORB-SLAM3, often lose accuracy in dynamic environments. This work presents YOLO11-ORB-SLAM3, an extension of ORB-SLAM3 for dynamic scenarios that integrates a YOLO11-based instance segmentation module to detect and exclude dynamic features from the tracking process. The system supports both stereo and RGB-D cameras. Its performance was evaluated on challenging dynamic sequences of the public TUM RGB-D dataset, as well as through real-world experiments on a mobile robot equipped with a stereo camera, demonstrating its robustness and viability for real robotic applications. Experimental results show that the proposed system outperforms the original ORB-SLAM3, reducing the error by 93% on the TUM dataset while preserving computational efficiency.