Abstract
This paper presents an improved approach to scene-aware camera relocalization using RGB images and poses. Building upon the ACE network, we propose a refined head structure that integrates skip and dense connections with channel attention mechanisms. We also modify the loss function and the pose solver, leveraging SQPnP and iterative optimization. In experiments on the 7-Scenes, 12-Scenes, and Wayspots datasets, these enhancements reduce the average localization error by up to 30% and the computation time by approximately 10% relative to the original ACE network, demonstrating the practicality and robustness of our approach.