A dataset for forgery detection and spotting in document images

Nicolas Sidère, Francisco Cruz, Mickaël Coustaty, Jean-Marc Ogier

2017

Over the last decades, the explosion in the volume of digital document images and the development of consumer tools for modifying them have led to a huge increase in reported cases of document fraud. This situation has promoted the development of automatic methods for both preventing and detecting forgeries in documents. However, document forensics is a sensitive topic: data is usually either private or unlabeled, and most reported works are evaluated on datasets with restricted access. In this paper we present a new public dataset consisting of a corpus of 477 corrupted payslips in which nearly 6000 characters were forged. Since it is provided with a reliable ground truth, we expect this dataset to be useful for many works in the digital forensics research domain.

Keywords:

World Wide Web
Spotting
Digital document
Computer science
Digital forensics
Restricted access
Forgery detection
