Webpage visual feature extraction and similarity algorithm
2020
The Internet is developing rapidly, and websites are far more varied than in previous decades. At the same time, people face greater network security risks, and the losses caused by phishing websites are increasingly severe. Phishing websites imitate the interface of real websites. To better identify phishing websites, this paper proposes a visual feature extraction method and a visual similarity algorithm. First, the visual feature extraction method improves the VIPS algorithm to extract the logo block and uses perceptual hashing to compute each visual block's signature, which serves as the block's image feature. Then, the visual similarity algorithm uses the visual blocks' coordinates and thresholds to find a one-to-one correspondence between the visual blocks of two pages. Weights are assigned to the blocks according to the tree structure and the logo. The similarity of each pair of visual blocks is calculated from the Hamming distance between their visual features. The algorithm then combines the visual block similarities with the visual block weights to obtain the visual similarity of the web pages. Finally, we verify the feasibility of the algorithm on multiple pairs of phishing and legitimate webpages and achieve excellent results.
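The similarity step described above can be illustrated with a minimal sketch. It is not the paper's implementation: a simple average hash stands in for the perceptual hash, each block is assumed to be already reduced to a small grayscale grid, and the function names, the weights, and the normalization by total weight are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Stand-in for a perceptual hash (assumption): one bit per pixel,
    set when the pixel is brighter than the mean of the grid."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the two signatures differ."""
    return bin(a ^ b).count("1")


def block_similarity(a: int, b: int, n_bits: int) -> float:
    """Map the Hamming distance to a similarity in [0, 1]."""
    return 1.0 - hamming_distance(a, b) / n_bits


def page_similarity(pairs: List[Tuple[int, int, float]], n_bits: int) -> float:
    """Weighted average of block similarities over matched block pairs.

    Each entry is (signature_a, signature_b, weight); the weights are
    hypothetical and would come from the tree structure and logo."""
    total_weight = sum(w for _, _, w in pairs)
    if total_weight == 0:
        return 0.0
    weighted = sum(w * block_similarity(a, b, n_bits) for a, b, w in pairs)
    return weighted / total_weight


# Two tiny 2x2 "blocks": the second differs in one pixel.
grid_a = [[10, 200], [220, 30]]
grid_b = [[10, 200], [20, 30]]
sig_a = average_hash(grid_a)
sig_b = average_hash(grid_b)

# A logo block matched exactly (weight 2.0) and a content block that
# differs in one bit (weight 1.0).
score = page_similarity([(sig_a, sig_a, 2.0), (sig_a, sig_b, 1.0)], n_bits=4)
print(round(score, 3))
```

Giving the logo block a larger weight means that a mismatch in the logo lowers the page score more than a mismatch in an ordinary block, which is the intuition behind weighting by structure and logo.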