OCR Quality Improvement Using Image Preprocessing

2016 
Optical character recognition (OCR) remains a difficult problem for noisy documents and for documents scanned at low resolution. Many current approaches rely on stored font models, which break down when the document is noisy or is set in a font dissimilar to the stored fonts. In this paper we test two approaches to preprocessing, that is, correcting, the input images. The focus is on noise reduction, lightness correction and binarization, all performed relative to the letters found in the image, once with a slow but more accurate method and once with a fast but less accurate one. We then compare the results to see whether the extra time spent developing a more complex letter deduction technique offers significant improvements.
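To make the three preprocessing stages concrete, the following is a minimal sketch in plain Python. It is not the method evaluated in the paper: it assumes a grayscale image given as a list of rows of 0-255 integers, applies each correction globally rather than relative to found letters, and swaps in standard techniques (a 3x3 median filter for noise reduction, a min-max contrast stretch for lightness correction, and Otsu's threshold for binarization). All function names are hypothetical.

```python
from statistics import median


def median_filter(img):
    """Noise reduction: 3x3 median filter; border pixels use the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def stretch_lightness(img):
    """Lightness correction: linearly map the darkest pixel to 0 and the brightest to 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image, nothing to stretch
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def otsu_threshold(img):
    """Return the threshold t that maximises the between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img, t):
    """Binarization: 1 for ink (pixel <= t), 0 for background."""
    return [[1 if p <= t else 0 for p in row] for row in img]


def preprocess(img):
    """Chain the three stages: denoise, correct lightness, binarize."""
    cleaned = stretch_lightness(median_filter(img))
    return binarize(cleaned, otsu_threshold(cleaned))
```

Calling `preprocess` on a small page image returns a matrix of the same size in which 1 marks ink and 0 marks background. Note that a 3x3 median filter also erases strokes only one pixel wide, which is one reason a letter-aware correction, as studied in the paper, may be preferable to a global one.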