Hunting Brand Domain Forgery: A Scalable Classification for Homograph Attack
2019
Visual homograph attack is a way that the attackers deceive victims about what domain they are communicating with by exploiting the fact that many characters look alike. The attack is growing into a serious problem and raising broad attention in reality when recently many brand domains have been attacked such as apple.com (Apple Inc.), adobe.com (Adobe Systems Incorporated), lloydsbank.co.uk (Lloyds Bank), etc. Therefore, how to detect visual homograph becomes a hot topic both in industry and research community. Several existing papers and tools have been proposed to find some homographs of a given domain based on different subsets of certain look-alike characters, or based on an analysis on the registered International Domain Name (IDN) database. However, we still lack a scalable and systematic approach that can detect sufficient homographs registered by attackers with a high accuracy and low false positive rate. In this paper, we construct a classification model to detect homographs and potential homographs registered by attackers using machine learning on feasible and novel features which are the visual similarity on each character and some selected information from Whois. The implementation results show that our approach can bring up to 95.90% of accuracy with merely 3.27% of false positive rate. Furthermore, we also make an empirical analysis on the collected homographs and found some interesting statistics along with concrete misbehaviors and purposes of the attackers.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
7
References
5
Citations
NaN
KQI