Text detection and script identification in natural scene images using deep learning

2021 
Abstract Detecting text in an image and identifying its script are important tasks in optical character recognition, and both are particularly challenging in natural scene images. Previous studies on script identification have focused on convolutional neural networks. In other studies, fully convolutional networks (FCNs) have been used for model enhancement but not as classifiers. In this study, we use FCNs for both model enhancement and classification. The proposed methodology extends the Efficient and Accurate Scene Text Detector (EAST) by adding new FCN branches for script identification. Moreover, whereas most end-to-end (e2e) methods train the text detection and script identification models separately, we propose two e2e methods that train the models jointly, namely, multi-channel mask (MCM) and multi-channel segmentation (MCS). The results show that MCM performs comparably to other state-of-the-art methods, whereas MCS outperforms existing methods, with recall values of 54.34% and 81.13% on the ICDAR MLT 2017 and MLe2e datasets, respectively.
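The abstract does not say how the multi-channel outputs are decoded, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes the network emits one score map per script plus a background channel, labels each pixel with its highest-scoring channel, and assigns each detected box the majority script among its non-background pixels. The label set `SCRIPTS`, the function names, and the box format are all hypothetical.

```python
from collections import Counter

# Hypothetical label set; channel 0 is assumed to be background.
SCRIPTS = ["background", "Latin", "Arabic", "Chinese"]


def pixel_labels(score_maps):
    """Return labels[y][x], the index of the highest-scoring channel.

    score_maps is indexed as score_maps[channel][y][x].
    """
    n_channels = len(score_maps)
    height = len(score_maps[0])
    width = len(score_maps[0][0])
    return [
        [
            max(range(n_channels), key=lambda c: score_maps[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


def script_for_box(labels, box):
    """Majority vote over non-background pixels inside a box.

    box is (x0, y0, x1, y1) with exclusive upper bounds (an assumption).
    Returns None when the box contains only background pixels.
    """
    x0, y0, x1, y1 = box
    votes = Counter(
        labels[y][x]
        for y in range(y0, y1)
        for x in range(x0, x1)
        if labels[y][x] != 0
    )
    if not votes:
        return None
    return SCRIPTS[votes.most_common(1)[0][0]]
```

In this sketch the detector's boxes and the script maps share one forward pass, which is the sense in which detection and identification are handled jointly; how the paper actually combines them may differ.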