OCRNet: a robust deep learning framework for alphanumeric character recognition to assist the visually impaired

Abstract

Optical Character Recognition (OCR) is a transformative Artificial Intelligence (AI) technology that translates printed or handwritten text into digital, machine-readable form. OCR systems serve as assistive tools for visually impaired people, providing real-time text recognition and helping them interact with their surroundings. This paper presents "OCRNet", a robust deep learning approach for detecting and recognizing alphanumeric characters in dynamic environments. First, an optimized 43-layer neural network is designed to capture the spatial features of the 62 alphanumeric characters. A Gated Recurrent Unit (GRU) is then added to capture the temporal dependencies among these characters and enhance feature learning. This hybrid model outperforms state-of-the-art Convolutional Neural Networks (CNNs) such as EfficientNetB7, MobileNetV2, ResNet50, DenseNet121, and others, achieving an accuracy of 95%, precision of 94%, recall of 95%, and F1-score of 96%. To promote portability and affordability, the model is implemented and tested on a Raspberry Pi platform, where it has an inference time of 120 ms. Tailored for visually impaired users, the proposed system provides real-time text recognition and audio feedback, enabling seamless interaction with textual content in everyday scenarios such as street signs, documents, and digital displays.
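To make the recurrent half of the hybrid design concrete, the following is a minimal sketch of a GRU reading a sequence of feature vectors and classifying it into one of 62 alphanumeric classes. It is not the authors' implementation: the 43-layer network is replaced by random stand-in features, the weights are random and untrained, biases are omitted, and every name and dimension (`GRUCell`, `classify`, 16 input features, 32 hidden units, 8 time steps) is an assumption chosen for illustration. The abstract also does not say how spatial features are arranged into a sequence, so that step is assumed here.

```python
import math
import random
import string

# 10 digits + 26 uppercase + 26 lowercase letters = 62 classes, as in the abstract.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical single GRU cell with random weights (biases omitted)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input matrix (W*) and one recurrent matrix (U*) per gate.
        self.Wz = rand_matrix(hidden_size, input_size, rng)
        self.Uz = rand_matrix(hidden_size, hidden_size, rng)
        self.Wr = rand_matrix(hidden_size, input_size, rng)
        self.Ur = rand_matrix(hidden_size, hidden_size, rng)
        self.Wh = rand_matrix(hidden_size, input_size, rng)
        self.Uh = rand_matrix(hidden_size, hidden_size, rng)

    def step(self, x, h):
        # Update gate z: how much of the candidate state to let in.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        # Reset gate r: how much of the old state feeds the candidate.
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        # New state interpolates between the old state and the candidate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_gru(cell, sequence):
    """Feed the sequence through the cell and return the final hidden state."""
    h = [0.0] * cell.hidden_size
    for x in sequence:
        h = cell.step(x, h)
    return h


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sequence, cell, W_out):
    """Map the final GRU state to a probability over the 62 classes."""
    probs = softmax(matvec(W_out, run_gru(cell, sequence)))
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


rng = random.Random(0)
cell = GRUCell(input_size=16, hidden_size=32, rng=rng)
W_out = rand_matrix(len(CLASSES), 32, rng)
# Stand-in for CNN output: 8 time steps of 16-dimensional feature vectors.
features = [[rng.uniform(-1.0, 1.0) for _ in range(16)] for _ in range(8)]
label, probs = classify(features, cell, W_out)
print(label, round(sum(probs), 6))
```

Because the weights are untrained, the predicted label is meaningless; the sketch only shows how the update and reset gates carry information across the feature sequence before a final 62-way softmax. A practical version would presumably use a trained convolutional backbone and a framework-provided GRU layer.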
