Extraction and Recognition of Handwritten Gujarati Characters and Numerals from Images Using Deep Learning

2022 
Automated recognition of Indian scripts from a document or image is a challenging problem. Gujarati is the script used to write the Gujarati language, the official language of the western Indian state of Gujarat, which is spoken and written by nearly 60.3 million people. Gujarati script can be considered more complicated for machine recognition owing to the presence of complex consonants, vowel modifiers, and compound characters called ‘Jodakshar’, and the absence of the ‘Shirorekha’. The problem becomes more challenging still when the script to be recognized is handwritten rather than printed. Two basic approaches exist in the literature: offline and online handwritten Gujarati script recognition. Online methods take the input in real time, whereas offline systems work on scanned images, which makes the recognition task lengthier and more complicated. The contribution of this paper is twofold. First, a dataset of images of handwritten Gujarati characters and numerals is made available. Second, an approach to offline Gujarati character recognition is presented that extracts Gujarati characters from an image of handwritten text and then recognizes them with deep learning. The accuracy of the system is more than 98% in most cases. The best accuracy obtained when tested with unseen character images is 82.15%.
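The abstract does not say how characters are extracted from the image, so the following is only a minimal sketch of one common technique, vertical projection-profile segmentation, and not the paper's method. It assumes the image has already been binarized into rows of 0 (background) and 1 (ink) and that characters are separated by at least one empty column; the names `segment_columns`, `crop`, and `toy` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 0 (background) and 1 (ink); an assumed format


def segment_columns(image: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # entering a run of inked columns
        elif count == 0 and start is not None:
            spans.append((start, x))       # leaving the run at an empty column
            start = None
    if start is not None:
        spans.append((start, width))       # run touches the right edge
    return spans


def crop(image: Image, span: Tuple[int, int]) -> Image:
    """Cut one column range out of the image, keeping every row."""
    start, end = span
    return [row[start:end] for row in image]


# Toy 3x7 image holding two glyph-like blobs separated by empty columns.
toy = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0],
]
spans = segment_columns(toy)
glyphs = [crop(toy, s) for s in spans]
```

Each cropped glyph would then be resized and passed to a classifier. A real handwritten page would likely need line segmentation first and a more robust method (for example, connected-component analysis) wherever strokes touch or overlap.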