Enabling Text-Line Segmentation in Run-Length Encoded Handwritten Document Image Using Entropy-Driven Incremental Learning
2020
In today’s digital era, document images are generally archived and transmitted in compressed form to save storage space and bandwidth. In CCITT Group 3 and Group 4 compression, the compressed representation is a stream of white and black pixel sequences called runs, which correspond to the background and foreground regions of the document image, respectively. In this paper, we propose a novel entropy-driven incremental learning technique that works directly on the compressed stream of runs and facilitates text-line segmentation in handwritten document images using entropy and connected component analysis. A Spatial Entropy Quantifier (SEQ) is extracted from the stream of runs over a suitable window. Incremental entropy and connected component analysis are then carried out to separate text and non-text regions, leading to automatic text-line segmentation. The proposed method is validated on a compressed dataset of handwritten document images, and its performance is reported.
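To make the idea of working directly on runs concrete, the following is a minimal Python sketch, not the method of the paper. It assumes each row is stored as alternating white and black run lengths starting with a white run, and it uses the Shannon entropy of the black-pixel fraction in a window of rows as a stand-in for the Spatial Entropy Quantifier, whose exact definition the abstract does not give. The names row_to_runs, window_entropy, and text_bands are hypothetical, and the incremental learning and connected component stages are omitted.

```python
import math


def row_to_runs(row):
    """Encode a binary row (0 = white, 1 = black) as alternating run lengths.

    The first run is always white and may have length 0 (assumed convention).
    """
    runs = []
    current = 0
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return runs


def binary_entropy(p):
    """Shannon entropy in bits of a two-symbol source with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def window_entropy(rows_of_runs, window):
    """Entropy of the black-pixel fraction per non-overlapping window of rows.

    Computed from run lengths only; pixels are never decompressed.
    This is an illustrative stand-in, not the paper's SEQ.
    """
    entropies = []
    for start in range(0, len(rows_of_runs), window):
        block = rows_of_runs[start:start + window]
        total = sum(sum(runs) for runs in block)
        black = sum(sum(runs[1::2]) for runs in block)  # odd positions are black
        p = black / total if total else 0.0
        entropies.append(binary_entropy(p))
    return entropies


def text_bands(entropies, threshold=0.0):
    """Group consecutive windows with entropy above threshold into bands.

    Returns a list of (first_window, last_window) index pairs.
    """
    bands = []
    start = None
    for i, h in enumerate(entropies):
        if h > threshold and start is None:
            start = i
        elif h <= threshold and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(entropies) - 1))
    return bands
```

In this sketch, windows made up entirely of background have zero entropy, so bands of non-zero entropy act as candidate text lines. Touching or overlapping handwritten lines would need further analysis, such as the connected component step the abstract describes.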