Mask Scene Text Recognizer.

Haodong Shi, Liangrui Peng, Ruijie Yan, Gang Yao, Shuman Han, Shengjin Wang

2021

Scene text recognition is a challenging sequence modeling problem. This paper proposes a novel mask scene text recognizer (MSTR) that incorporates a supervised task of predicting a text image mask into a CNN (convolutional neural network)-Transformer framework for scene text recognition. The mask prediction branch is connected in parallel with the CNN backbone, and the predicted mask is used as attention weights for the feature maps output by the CNN. We investigate three variants of the mask prediction branch: (a) a mask branch, which predicts the text foreground mask; (b) a boundary branch, which predicts the boundaries of characters in the input image; and (c) fused mask and boundary branches with different fusion schemes. Experimental results on seven commonly used scene text recognition datasets show that our method with fused mask and boundary branches outperforms previous state-of-the-art methods.
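
The idea of using a predicted mask as attention weights over CNN feature maps can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not specify the activation, the resolution at which the mask is applied, or the fusion schemes, so the sigmoid, the assumption that the mask has the same height and width as the feature map, and the three fusion options (`mean`, `max`, `product`) are all hypothetical choices made here for illustration. The function names are likewise invented.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_weights(logits):
    """Map an H x W grid of branch logits to weights in (0, 1).

    Assumption: a sigmoid is used; the abstract does not say which activation.
    """
    return [[sigmoid(v) for v in row] for row in logits]


def fuse(mask, boundary, scheme="mean"):
    """Combine two H x W weight maps into one.

    Assumption: these three schemes are illustrative stand-ins, not the
    fusion schemes evaluated in the paper.
    """
    ops = {
        "mean": lambda m, b: 0.5 * (m + b),
        "max": max,
        "product": lambda m, b: m * b,
    }
    op = ops[scheme]
    return [
        [op(m, b) for m, b in zip(mask_row, boundary_row)]
        for mask_row, boundary_row in zip(mask, boundary)
    ]


def apply_attention(features, weights):
    """Scale every channel of a C x H x W feature map by the same H x W weights."""
    return [
        [[f * w for f, w in zip(feat_row, weight_row)]
         for feat_row, weight_row in zip(channel, weights)]
        for channel in features
    ]


# Toy inputs: 2 channels, 2 x 2 spatial grid.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[-1.0, 0.5], [2.0, -2.0]],
]
mask_logits = [[4.0, -4.0], [0.0, 4.0]]       # hypothetical mask-branch output
boundary_logits = [[-4.0, -4.0], [0.0, 4.0]]  # hypothetical boundary-branch output

mask = predict_weights(mask_logits)
boundary = predict_weights(boundary_logits)
weights = fuse(mask, boundary, scheme="max")
attended = apply_attention(features, weights)
```

In this sketch, positions where either branch is confident keep most of their feature magnitude, while positions where both branches output low weights are suppressed before the features would be passed on to the Transformer.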
