Abstract
Optical Character Recognition (OCR) is a transformative Artificial Intelligence (AI) technology that translates printed or handwritten text into digital, machine-readable form. OCR systems can serve as assistive tools for visually impaired people, helping them recognize text in real time and interact with their surroundings. This paper presents "OCRNet", a robust deep learning approach for detecting and recognizing alphanumeric characters in dynamic environments. First, an optimized neural network with 43 layers is designed to capture the spatial features of the 62 alphanumeric characters. A Gated Recurrent Unit (GRU) is then added to capture the temporal dependencies of these characters and enhance feature learning. This hybrid model outperforms state-of-the-art Convolutional Neural Networks (CNNs), including EfficientNetB7, MobileNetV2, ResNet50, and DenseNet121, achieving an accuracy of 95%, precision of 94%, recall of 95%, and F1-score of 96%. To promote portability and affordability, the model is implemented and tested on a Raspberry Pi platform, where it has an inference time of 120 ms. Tailored for visually impaired users, the proposed system provides real-time text recognition and audio feedback, enabling seamless interaction with textual content in everyday scenarios such as street signs, documents, and digital displays.
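As an illustration of the 62-class output space described above, the following minimal sketch maps a vector of per-class scores to a character. The class ordering (digits, then uppercase, then lowercase letters) and the names `CLASSES`, `decode_prediction`, and `decode_sequence` are assumptions made for this example and are not taken from the OCRNet implementation.

```python
import string

# Assumed class ordering (hypothetical): 10 digits, 26 uppercase, 26 lowercase.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def decode_prediction(scores):
    """Return the character whose class score is highest."""
    if len(scores) != len(CLASSES):
        raise ValueError(f"expected {len(CLASSES)} scores, got {len(scores)}")
    # Index of the largest score (top-1 prediction).
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best_index]


def decode_sequence(score_rows):
    """Decode one character per score vector and join them into a string."""
    return "".join(decode_prediction(row) for row in score_rows)
```

In a pipeline like the one outlined in the abstract, each score vector would come from the classifier's final layer, and the decoded string could then be passed to a text-to-speech component for audio feedback.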