Finding and fighting search engine spam

2007 
Web surfers rely on search engines to find information on the web. Search engine spam, the attempt to deceive search engine ranking algorithms, is considered by experts at well-known search engine companies to be one of the major challenges facing search engines today. If no action is taken, the quality of search engine results will be greatly harmed. This dissertation explores in detail effective solutions to several search engine spam techniques, such as link farms and cloaking. Our approaches can effectively nullify the link farm effect, and their precision outperforms that of the standard link-based ranking algorithm, HITS, by more than 200%. Our approach to detecting cloaking behavior can achieve an accuracy of 90.5% and a recall of 86.8%. In addition, this dissertation studies the idea of combining topicality with trust to demote spam; experimental results show that this approach can demote up to 43% more spam than TrustRank. We also investigate combining trust and authority to improve search quality. Overall, the experimental results indicate that the approaches in this dissertation can significantly improve search quality and demote spam.