Straight Line Reconstruction for Fully Materialized Table Extraction in Degraded Document Images.

2019 
Tables are one of the best ways to synthesize information in documents, such as statistical results and key figures. In this article, we focus on the extraction of materialized tables from document images, in the particular case where acquisition noise can disrupt the recovery of the table structures. Successive printing and scanning of a document, together with its deterioration, can lead to “broken” lines among the materialized segments of the tables. We propose a method based on the search for straight line segments in documents. It relies on a new image transform that locally defines primitives well suited for pattern recognition, and on a theoretical model of lines used to confirm their presence among a set of confident potential line parts. The extracted straight line segments are then used to reconstruct the table structures. Our approach has been evaluated from both quality and stability points of view.
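To make the notion of "broken" lines concrete, the sketch below joins roughly collinear horizontal segments that are separated by small gaps. It is only an illustration of the reconstruction idea, not the image transform or line model proposed in the article. The segment representation `(row, x_start, x_end)`, the function name `merge_broken_segments`, and the thresholds `max_gap` and `row_tol` are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical representation: a horizontal segment is (row, x_start, x_end).
Segment = Tuple[int, int, int]


def merge_broken_segments(
    segments: List[Segment], max_gap: int = 5, row_tol: int = 1
) -> List[Segment]:
    """Join near-collinear horizontal segments separated by small gaps.

    max_gap and row_tol are illustrative thresholds (in pixels), not values
    taken from the article.
    """
    merged: List[List[int]] = []
    # Process segments from left to right so each one can only extend
    # a line that started at or before it.
    for row, x0, x1 in sorted(segments, key=lambda s: s[1]):
        for line in merged:
            same_row = abs(line[0] - row) <= row_tol
            small_gap = x0 - line[2] <= max_gap  # negative means overlap
            if same_row and small_gap:
                line[2] = max(line[2], x1)  # extend the existing line
                break
        else:
            # No compatible line found: start a new one.
            merged.append([row, x0, x1])
    return [(r, a, b) for r, a, b in merged]


pieces = [(10, 0, 40), (10, 43, 90), (11, 93, 120), (50, 0, 30), (50, 60, 100)]
lines = merge_broken_segments(pieces)
```

In this example the three pieces near row 10 are separated by 3-pixel gaps and are joined into one line, while the two pieces on row 50 stay separate because their 30-pixel gap exceeds `max_gap`. A real table extractor would also need to handle vertical and slightly skewed lines.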