A dataset for forgery detection and spotting in document images

2017 
Over the last few decades, the explosion in the volume of digital document images and the development of consumer tools to modify them have led to a huge increase in reported cases of document fraud. This situation has promoted the development of automatic methods both for preventing forgeries and for detecting them in modified documents. However, document forensics is a sensitive topic: data is usually either private or unlabeled, and most reported works are evaluated on datasets with restricted access. In this paper we present a new public dataset consisting of a corpus of 477 corrupted payslips in which nearly 6000 characters were forged. Because it comes with a reliable ground truth, we expect this dataset to be useful for many works in the digital forensics research domain.