A Cache Language Model for Whole Document Handwriting Recognition

2014 
With increasing computational power, the trend in unconstrained text recognition is moving towards whole document processing. For this task, more sophisticated language models can be employed. One approach is to take advantage of the fact that the text of a document normally deals with a specific topic, and hence the word occurrence probability is biased. Cache language models combine information about recent words, the cache, with a general statistical language model to increase the recognition rate. In this work we introduce a modified version of the cache language model to the task of handwriting recognition, where the N-best recognition output of the entire document is used to refine the language model for a subsequent recognition pass. An experimental evaluation on the IAM database demonstrates that the word error rate can be reduced with the proposed cache language model.
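The general idea of a cache language model can be illustrated with a minimal sketch. It shows the standard linear interpolation of a cache unigram estimate with a base language model probability, which is not necessarily the modified version proposed in this work. The function names, the nested-list N-best format, and the interpolation weight `lam` are assumptions made for illustration only.

```python
from collections import Counter


def build_cache(nbest_lists):
    """Count word occurrences over all N-best hypotheses of a document.

    nbest_lists: one entry per text line; each entry is a list of
    hypotheses, and each hypothesis is a list of words (assumed format).
    """
    cache = Counter()
    for hypotheses in nbest_lists:
        for words in hypotheses:
            cache.update(words)
    return cache


def cache_prob(word, cache):
    """Relative frequency of `word` in the cache (0.0 for an empty cache)."""
    total = sum(cache.values())
    return cache[word] / total if total else 0.0


def interpolated_prob(word, base_prob, cache, lam=0.2):
    """Linear interpolation of the base model and the cache estimate.

    base_prob: probability of `word` under the general language model.
    lam: hypothetical interpolation weight, to be tuned on validation data.
    """
    return (1.0 - lam) * base_prob + lam * cache_prob(word, cache)


# First-pass N-best output for a two-line toy document.
nbest = [
    [["the", "cat"], ["the", "cut"]],
    [["a", "cat"]],
]
cache = build_cache(nbest)
boosted = interpolated_prob("cat", base_prob=0.01, cache=cache)
unseen = interpolated_prob("dog", base_prob=0.05, cache=cache)
```

In this toy example, "cat" occurs in several hypotheses and so receives a higher probability than the base model alone would give it, while "dog", which never appears in the first-pass output, is scaled down by the factor `1 - lam`.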