Automated Honey Document Generation Using Genetic Algorithm

2021 
Sensitive data exfiltration is one of the predominant threats to cybersecurity. Honey documents are a cyber deception technology that addresses this issue. Most existing work focuses on honey document deployment or bait design and ignores the importance of the document contents. Believable and enticing honey contents are the foundation for deceiving attackers, discovering attacks, and protecting sensitive data. This paper presents a method for automating the generation of honey document contents by measuring believability and enticement. We use real documents as material and replace sensitive information with insensitive parts of other documents to generate honey contents. A genetic algorithm (GA) is deployed to achieve automatic multiobjective optimization of the generation process. Our method can generate a set of diverse honey documents from one origin document. Attackers have to wade through many documents with the same topics and similar contents, examining them in detail to tell them apart, which hinders the exfiltration attack. We conducted numerical and manual experiments with both Chinese and English documents, and the results validate the effectiveness of the method.
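The generation idea summarized above (replace sensitive sentences with insensitive sentences from other documents, and let a GA search for replacements that score well on believability and enticement) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: believability is approximated by word overlap with the replaced sentence, enticement by the presence of hypothetical bait words, and the two objectives are combined by a simple weighted sum rather than a true multiobjective scheme. The names `jaccard`, `fitness`, `evolve`, and the sample data are hypothetical.

```python
import random

def jaccard(a, b):
    """Word-overlap similarity between two sentences, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def fitness(genes, origin, sensitive_idx, pool, bait_words, w=0.5):
    """Weighted sum of two toy proxies (assumed, not the paper's metrics)."""
    repl = [pool[g] for g in genes]
    # Believability proxy: replacement resembles the sentence it replaces.
    believ = sum(jaccard(origin[i], r) for i, r in zip(sensitive_idx, repl)) / len(genes)
    # Enticement proxy: replacement mentions at least one bait word.
    entice = sum(any(b in r.lower() for b in bait_words) for r in repl) / len(genes)
    return w * believ + (1 - w) * entice

def evolve(origin, sensitive_idx, pool, bait_words,
           pop_size=20, generations=30, p_mut=0.2, seed=0):
    """Evolve a list of pool indices, one per sensitive sentence."""
    rng = random.Random(seed)
    n = len(sensitive_idx)
    score = lambda g: fitness(g, origin, sensitive_idx, pool, bait_words)
    pop = [[rng.randrange(len(pool)) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best individual forward
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n) if n > 1 else 0  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: resample a gene from the pool with probability p_mut.
            child = [rng.randrange(len(pool)) if rng.random() < p_mut else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    doc = list(origin)
    for i, g in zip(sensitive_idx, best):
        doc[i] = pool[g]  # swap the sensitive sentence for the chosen one
    return doc, score(best)

# Hypothetical sample data: sentences 1 and 3 of the origin are sensitive.
origin = [
    "Quarterly report for the finance team.",
    "The admin password is hunter2.",
    "Revenue grew in the third quarter.",
    "Account number 12345678 holds the reserve funds.",
]
sensitive_idx = [1, 3]
pool = [
    "The admin console is reachable from the office network.",
    "Account statements are archived every quarter.",
    "Lunch is served at noon.",
    "The password policy requires rotation every ninety days.",
]
bait_words = ["password", "account"]

honey, best_score = evolve(origin, sensitive_idx, pool, bait_words)
print("\n".join(honey))
```

Running `evolve` with different `seed` values, or keeping several top individuals instead of only the best, is one way such a sketch could yield a set of diverse honey documents from a single origin.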