Multi-granularity Deep Local Representations for Irregular Scene Text Recognition

Hongchao Gao, Yujia Li, Jiao Dai, Xi Wang, Jizhong Han, Ruixuan Li

2021

Recognizing irregular text in natural scene images is challenging because the appearance of text is unconstrained, with variations in curvature, orientation, and distortion. Recent recognition networks treat this task as a text sequence labeling problem, and most capture the sequence from only a single-granularity visual representation, which limits recognition performance to some extent. In this article, we propose a hierarchical attention network that captures multi-granularity deep local representations for recognizing irregular scene text. The network consists of several hierarchical attention blocks, each containing a Local Visual Representation Module (LVRM) and a Decoder Module (DM). Building on this hierarchical attention network, we propose a scene text recognition network. Extensive experiments show that the proposed network achieves state-of-the-art performance on several benchmark datasets, including IIIT-5K, SVT, CUTE, SVT-Perspective, and the ICDAR datasets, with shorter training time.
