Enabling Text-Line Segmentation in Run-Length Encoded Handwritten Document Image Using Entropy-Driven Incremental Learning

2020 
In today’s digital era, archival and transmission of document images are generally carried out in a compressed form in order to avoid wastage of storage space and bandwidth. In the case of CCITT Group 3 and Group 4, the compressed representation is a stream of white and black pixel intensity values called runs, correspondingly indicating background and foreground regions of the document image. In this research paper, we propose a novel entropy-driven incremental learning technique that directly works on the compressed stream of runs, and subsequently facilitates text-line segmentation in handwritten document images using entropy and connected component analysis. Spatial Entropy Quantifier (SEQ) is extracted from the stream of runs based on a suitable window. Further, incremental entropy and connected component analysis are carried out thus separating text and non-text regions leading to automatic text-line segmentation. The proposed method is validated with the compressed dataset of handwritten document images and performance is reported.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    11
    References
    2
    Citations
    NaN
    KQI
    []