HCADecoder: A Hybrid CTC-Attention Decoder for Chinese Text Recognition

2021 
Text recognition has attracted much attention and achieved exciting results on several commonly used public English datasets in recent years. However, most of these well-established methods, such as connectionist temporal classification (CTC)-based methods and attention-based methods, pay less attention to the challenges of Chinese scene text, especially long text sequences. In this paper, we exploit the characteristics of the Chinese word frequency distribution and propose a hybrid CTC-Attention decoder (HCADecoder), supervised with bigram mixture labels, for Chinese text recognition. Specifically, we first add high-frequency bigram subwords to the original unigram labels to construct mixture bigram labels, which shortens the decoding length. Then, in the decoding stage, the CTC module outputs a preliminary result, in which confused predictions are replaced with bigram subwords. The attention module refines this preliminary result and outputs the final prediction. Experimental results on four Chinese datasets demonstrate the effectiveness of the proposed method for Chinese text recognition, especially for long texts. Code will be made publicly available (https://github.com/lukecsq/hybrid-CTC-Attention).
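The label construction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: given a unigram character sequence and a set of high-frequency bigrams (which the paper derives from the word frequency distribution), adjacent characters belonging to a frequent bigram are greedily merged into a single subword token, shortening the sequence the decoder must emit.

```python
def build_mixture_label(chars, frequent_bigrams):
    """Greedily merge adjacent characters into bigram subwords.

    chars: list of unigram characters, e.g. ['中', '国', '人', '民']
    frequent_bigrams: set of high-frequency bigrams, e.g. {'中国', '人民'}
    Returns a shorter mixed unigram/bigram label sequence.
    (Hypothetical helper for illustration; the paper's actual merging
    rule may differ.)
    """
    label, i = [], 0
    while i < len(chars):
        # Prefer a frequent bigram starting at position i, if any.
        if i + 1 < len(chars) and chars[i] + chars[i + 1] in frequent_bigrams:
            label.append(chars[i] + chars[i + 1])  # merged bigram subword
            i += 2
        else:
            label.append(chars[i])  # keep the original unigram
            i += 1
    return label

# A four-character label collapses to two bigram tokens:
print(build_mixture_label(list("中国人民"), {"中国", "人民"}))  # → ['中国', '人民']
```

Shortening the label sequence in this way reduces the number of decoding steps, which is the property the paper exploits for long Chinese texts.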