Abstract
With the widespread adoption of mobile devices in daily life, efficiently capturing and digitizing documents has emerged as a critical research problem. Document images captured with mobile devices are often compromised by shadows and geometric distortions, which degrade image quality and reduce both OCR accuracy and readability. To address this, we propose a novel method that uses control points and illumination prediction to rectify distortions and remove shadows in captured document images. Spatial attention guides the interpolation between control points and reference points, which removes the geometric distortions. Following geometric unwarping, an illumination correction model removes shadows and enhances the clarity of the document surface, improving both human readability and OCR accuracy. Our method rectifies document distortions robustly across diverse scenarios. Evaluation on the DocUNet benchmark dataset shows that our approach achieves competitive results compared with state-of-the-art techniques.