The improved Shark Search Approach for Crawling Large-scale Web Data

2014 
Web crawling is an important approach for collecting large-scale web data from, and keeping up with, the rapidly expanding Internet. This paper proposes an improved shark search approach for crawling large-scale Web data, based on link clustering and tunneling technology. In this study we classify Web links, rather than downloaded web pages, to determine relevancy, which can avoid the local optima of the traditional shark search algorithm. The experiments show that the improved shark search algorithm can provide the simplest alternative for overcoming the problem of instantaneous pages that rank low in relevance to the given topic.