Multiscale segmentation for MRC document compression using a Markov random field model
2010
The Mixed Raster Content (MRC) standard (ITU-T T.44) specifies a framework for document compression that can dramatically improve the compression/quality tradeoff compared with traditional lossy image compression algorithms. The key to MRC's performance is the separation of the document into foreground and background layers, with the separation represented as a binary mask. In this paper, we propose a novel multiscale segmentation scheme based on the sequential application of two algorithms. The first algorithm, Cost Optimized Segmentation (COS), is a blockwise segmentation algorithm formulated in a global cost optimization framework. The second algorithm, Connected Component Classification (CCC), refines the initial segmentation by classifying feature vectors of connected components using a Markov random field (MRF) model. The combined COS/CCC segmentation algorithms are then incorporated into a multiscale framework in order to improve the segmentation accuracy for text of varying size.
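To make the layer model concrete, the following minimal sketch shows how a binary mask selects, pixel by pixel, between a foreground and a background layer when an MRC-style image is recomposed. It is an illustration only: the function names `mrc_compose` and `threshold_mask` are hypothetical, and the fixed-threshold segmenter is a deliberately naive stand-in, not the COS or CCC algorithm proposed in the paper.

```python
# Illustrative sketch of the MRC layer model (not the paper's COS/CCC method).
# A binary mask selects, per pixel, between a foreground and a background layer.

def mrc_compose(mask, foreground, background):
    """Rebuild an image: foreground where mask is 1, background where it is 0."""
    return [
        [fg if m else bg for m, fg, bg in zip(mask_row, fg_row, bg_row)]
        for mask_row, fg_row, bg_row in zip(mask, foreground, background)
    ]

def threshold_mask(image, threshold=128):
    """Hypothetical stand-in segmenter: mark dark pixels (text) as foreground."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

# Tiny grayscale "document": dark text pixels (20) on a light page (230).
page = [
    [230, 20, 230],
    [230, 20, 230],
]
mask = threshold_mask(page)
foreground = [[20] * 3 for _ in page]    # flat text-colour layer
background = [[230] * 3 for _ in page]   # flat paper-colour layer
restored = mrc_compose(mask, foreground, background)
```

In a real MRC codec the mask would be coded with a lossless binary coder and the two colour layers with lossy image coders, so the quality of the mask largely determines how sharp the text remains after decoding.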