Gray Tunneling Based on Joint Link for Focused Crawling

2015 
Tunneling problems caused by the topic multiplicity of a web page weaken the measured relevance of highly relevant pages. In this paper, we propose a novel relevance prediction method for focused crawling to solve gray tunneling. Our approach calculates the relevance score of a web page from the relevance scores of its blocks with respect to the topics, and calculates the URL score from its parent pages and its anchor contexts. We join the context similarity with the link similarity, which is based on Q feedback learning. Experimental results show that the proposed method outperforms the Link-Contexts, Best-First, and Breadth-First methods on all test data sets.
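The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes term-frequency cosine similarity, takes the page score as the maximum block score, and combines context and link similarity with a hypothetical weight `alpha`. The link similarity here is simply the mean parent-page score, standing in for the Q feedback learning component, which is not implemented. All function names are illustrative.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def page_relevance(blocks: list[str], topic: str) -> float:
    """Score a page by its best block (assumption), so that one on-topic
    block is not diluted by off-topic blocks on a multi-topic page."""
    return max((cosine(block, topic) for block in blocks), default=0.0)


def url_score(parent_scores: list[float], anchor_context: str,
              topic: str, alpha: float = 0.5) -> float:
    """Join context similarity and link similarity (assumed linear mix)."""
    context_sim = cosine(anchor_context, topic)
    # Placeholder for the learned link similarity: mean parent relevance.
    link_sim = sum(parent_scores) / len(parent_scores) if parent_scores else 0.0
    return alpha * context_sim + (1 - alpha) * link_sim


def rank_urls(candidates: dict[str, float]) -> list[str]:
    """Order frontier URLs from highest to lowest score."""
    return sorted(candidates, key=candidates.get, reverse=True)
```

Under these assumptions, a URL found on a page with a low overall score can still rank high in the frontier when its anchor context matches the topic, which is the behavior tunneling relies on.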