A First Step Towards NLP from Digitized Manuscripts: Virtual Restoration

2018 
Digitization of the documentary heritage conserved in libraries and archives is a common practice, intended to ensure the preservation and accessibility of this extensive part of humanity's cultural and historical patrimony. For the most precious, fragile, and hard-to-decipher manuscripts, specialized yet portable digitization equipment, such as high-resolution multispectral/hyperspectral cameras, is now available. Digitization has enabled the increasingly extensive use of digital image processing techniques to perform a number of virtual restoration tasks, which constitute a first, often necessary step prior to the automatic analysis of the written content, with the ultimate goal of performing automatic transcription and/or natural language processing tasks. Here we report our experience in this field, taking as a case study the problem of removing one of the most frequent and impairing degradations affecting many ancient manuscripts, namely bleed-through distortion. In this case, virtual restoration also has the immediate benefit of facilitating the work of philologists and paleographers interested in examining and transcribing the manuscript in the traditional way.