Abstract
Perspective distortion, irregular reading regions, and complex backgrounds in natural scenes limit the accuracy and efficiency of automatic meter reading systems. Current mainstream approaches rely predominantly on object-detection-based methods that are not optimized for text characteristics, and efforts to improve detection robustness under complex backgrounds typically focus on data preprocessing rather than model architecture. To address these limitations, a novel end-to-end framework, EDPNet (Efficient DB and PARSeq Network), is proposed to integrate efficient boundary detection and text recognition. EDPNet comprises two key components: EDNet for detection and EPNet for recognition. EDNet employs EfficientNetV2-s as its backbone, together with the Multi-Scale KeyDrop Attention (MSKA) and Efficient Multi-scale Attention (EMA) mechanisms, which address perspective distortion and complex backgrounds, respectively. In the recognition stage, EPNet integrates a DropKey Attention module into the PARSeq encoder, improving the recognition of irregular readings while mitigating overfitting. Experimental evaluations show that EDNet achieves an F1-score of 0.997988, outperforming DBNet++ (ResNet50) by 0.61%. In challenging scenarios, EDPNet surpasses state-of-the-art methods by 0.7% to 1.9% while reducing parameters by 20.03%. EPNet achieves 90.0% recognition accuracy, exceeding the current best performance by 0.2%. The proposed framework thus delivers superior accuracy and robustness in challenging conditions while remaining lightweight.
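To make the DropKey idea mentioned above concrete, the following is a minimal sketch of DropKey-style masking in single-head scaled dot-product attention, written in plain Python. It is an illustration under stated assumptions, not the EPNet implementation: the function name `dropkey_attention`, the `drop_ratio` parameter, and the guard that always keeps at least one key are hypothetical choices made for this example. The general technique it illustrates is masking randomly chosen attention logits to negative infinity before the softmax during training, so that the remaining weights are renormalized rather than simply zeroed.

```python
import math
import random


def dropkey_attention(q, k, v, drop_ratio=0.1, training=True, rng=None):
    """Single-head scaled dot-product attention with DropKey-style masking.

    q, k, v are lists of equal-width vectors (lists of floats).
    All names and defaults here are illustrative assumptions.
    """
    rng = rng or random.Random()
    d = len(q[0])
    out = []
    for qi in q:
        # Scaled dot-product logits between this query and every key.
        logits = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        if training and drop_ratio > 0:
            # Drop each key independently with probability drop_ratio.
            keep = [rng.random() >= drop_ratio for _ in logits]
            if not any(keep):
                # Assumed guard: never mask every key for a query.
                keep[rng.randrange(len(keep))] = True
            # Masking happens BEFORE softmax, so kept weights still sum to 1.
            logits = [l if m else -math.inf for l, m in zip(logits, keep)]
        peak = max(logits)
        exps = [math.exp(l - peak) for l in logits]  # masked logits become 0.0
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of value vectors.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out
```

With `training=False` the function reduces to ordinary attention, which mirrors the usual convention that regularization of this kind is applied only during training.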