Design and Implementation of Deep Web Customized Focused Web Crawler

2014 
Capturing data from the Internet accurately and efficiently is of utmost importance in the Big Data era. In this paper we propose a customized web crawler framework in which a highly accurate and controllable focused web crawler can be constructed by setting up configuration files. In addition, we implement Deep Web form submission and Deep Web data capture by improving the workflow of the focused crawler. Experiments on capturing data from the IHEP website and the mobile Tencent microblog, together with the crawler's practical performance on the IHEP big data platform, indicate its effectiveness and practicability.
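The idea of steering a focused crawler through configuration can be illustrated with a minimal sketch. This is not the paper's implementation: the configuration keys (`seeds`, `allow`, `deny`, `max_pages`), the function names, and the `example.org` URLs are all assumptions made for illustration, and page fetching is injected as a callable so the sketch runs without network access. Deep Web form submission is not covered here.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical configuration; the keys are assumptions, not the paper's format.
# In practice this could be loaded from a configuration file.
CONFIG = {
    "seeds": ["http://example.org/index.html"],
    "allow": [r"^http://example\.org/news/"],   # URL patterns considered on-topic
    "deny": [r"\.(jpg|png|pdf)$"],              # URL patterns that are always skipped
    "max_pages": 50,
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_allowed(url, config):
    """A URL is followed only if no deny pattern and at least one allow pattern matches."""
    if any(re.search(p, url) for p in config["deny"]):
        return False
    return any(re.search(p, url) for p in config["allow"])


def crawl(config, fetch):
    """Breadth-first crawl from the seeds.

    `fetch` is any callable mapping a URL to its HTML text, e.g. a real HTTP
    client or a stub. Returns the URLs in the order they were visited.
    """
    queue = deque(config["seeds"])
    seen = set(config["seeds"])
    visited = []
    while queue and len(visited) < config["max_pages"]:
        url = queue.popleft()
        html = fetch(url)
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the current page
            if link not in seen and is_allowed(link, config):
                seen.add(link)
                queue.append(link)
    return visited
```

With this layout, changing which pages are captured means editing only the `allow` and `deny` patterns rather than the crawl loop, which is one plausible reading of the controllability the abstract describes.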