Textual restoration of occluded Tibetan document pages based on side-enhanced U-Net

Siqi Liu, Libiao Jin, Fang Miao

2020

Recognizing the content of occluded Tibetan document pages is very challenging because the documents have not been digitized and have been in long-term storage. Multiple pages stick together, and textual characters occlude one another, which makes restoration difficult. Because of the large volume of Tibetan documents, it is infeasible for professionals to separate and repair these occluded pages manually. The separation of overlapping pages and the restoration of occluded pages therefore play important roles in the digitization of Tibetan documents. We extract the underlying pages by show-through scanning and by eliminating the text area of the top pages. To restore the occluded underlying pages, we present a side-enhanced U-Net (SEU-Net) that attaches a side feature extraction module and a side classification module to the U-Net to improve the classification of textual edges. Experiments on a dataset of Tibetan document restoration patches show that SEU-Net classifies the textual pixels in the occluded pages accurately, and that the side feature extraction module and the side classification module each improve performance independently.
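
The extraction step described above (removing the top page's text area from a show-through scan) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_underlying`, the three pixel labels, and the fixed `ink_threshold` are all assumptions made for illustration, and the scan is assumed to be a grayscale image in which darker pixels are ink.

```python
from typing import List

# Hypothetical labels, introduced only for this sketch.
OCCLUDED = -1    # hidden under top-page text; left for a restoration model
BACKGROUND = 0   # visible, no ink on the underlying page
TEXT = 1         # visible ink on the underlying page


def extract_underlying(
    show_through: List[List[int]],
    top_text_mask: List[List[bool]],
    ink_threshold: int = 128,  # assumed threshold, not from the paper
) -> List[List[int]]:
    """Label each pixel of a show-through scan after removing top-page text.

    show_through:  grayscale values 0..255 (dark = ink), one list per row.
    top_text_mask: True where the top page has text covering the pixel.
    """
    if len(show_through) != len(top_text_mask):
        raise ValueError("scan and mask must have the same number of rows")

    labels = []
    for scan_row, mask_row in zip(show_through, top_text_mask):
        if len(scan_row) != len(mask_row):
            raise ValueError("scan and mask rows must have the same length")
        out_row = []
        for gray, is_top_text in zip(scan_row, mask_row):
            if is_top_text:
                # The top page's text hides whatever lies underneath.
                out_row.append(OCCLUDED)
            elif gray < ink_threshold:
                out_row.append(TEXT)
            else:
                out_row.append(BACKGROUND)
        labels.append(out_row)
    return labels


if __name__ == "__main__":
    scan = [[10, 200], [50, 240]]
    mask = [[True, False], [False, False]]
    print(extract_underlying(scan, mask))
```

In this sketch the pixels labeled `OCCLUDED` are the ones a restoration network such as SEU-Net would have to classify as text or background.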
