Language Model Supervision for Handwriting Recognition Model Adaptation

Christopher Tensmeyer, Curtis Wigington, Brian Davis, Seth Stewart, Tony R. Martinez, William A. Barrett

2018

Large labeled datasets for training handwriting recognition (HWR) models are not available for every language and handwriting domain. One way to address this problem is to leverage high-resource languages to help train models for low-resource languages. In this work, we adapt HWR models trained on a source language to a target language that uses the same writing script. We do so using only labeled data in the source language, unlabeled data in the target language, and a language model in the target language. The language model is used to produce transcriptions of the unlabeled target data, which then serve as labels for regular example-based training. Using this approach, we demonstrate improved transferability among French, English, and Spanish using both historical and modern handwriting datasets.
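
The idea of letting a target-language model supply training labels can be illustrated with a toy sketch. This is not the authors' pipeline: the abstract does not specify the recognizer, the language model, or the decoding procedure. Here a unigram word-count table stands in for the language model, and edit-distance correction stands in for decoding; the function names (build_unigram_lm, correct_word, pseudo_label) and the Spanish sample text are hypothetical, chosen only for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def build_unigram_lm(corpus: list[str]) -> Counter:
    """ASSUMPTION: a word-count table stands in for the target language model."""
    return Counter(word for line in corpus for word in line.split())


def correct_word(word: str, lm: Counter, max_dist: int = 2) -> str:
    """Pick the closest vocabulary word; break ties by higher count in the LM."""
    best = min(lm, key=lambda w: (edit_distance(word, w), -lm[w]))
    # Keep the raw prediction when nothing in the vocabulary is close enough.
    return best if edit_distance(word, best) <= max_dist else word


def pseudo_label(raw_prediction: str, lm: Counter) -> str:
    """Turn a noisy recognizer output into a language-model-corrected label."""
    return " ".join(correct_word(w, lm) for w in raw_prediction.split())


# Target-language text only (no handwriting images, no manual labels).
target_text = ["el gato come pescado", "el perro come carne", "el gato duerme"]
lm = build_unigram_lm(target_text)

# Hypothetical outputs of a source-language recognizer on target-language lines.
raw_predictions = ["el gatc cone pescodo", "el perno duerne"]
pseudo_labels = [pseudo_label(p, lm) for p in raw_predictions]
print(pseudo_labels)
```

In a setup like this, each corrected string would be paired with its line image and used as an ordinary supervised training example for the recognizer on the target language.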

Keywords:

Machine learning
Artificial intelligence
Handwriting
Transcription (linguistics)
Labeled data
Language model
Transferability
Transfer of learning
Computer science
Handwriting recognition
Natural language processing
Low resource
