Web Crawler for Event-Driven Crawling of AJAX-Based Web Applications

2013 
This paper describes a novel technique for crawling AJAX-based applications through “event-driven” crawling in web browsers. The algorithm uses the browser context to analyse the DOM: it scans the DOM tree, detects elements capable of changing the application state, triggers events on those elements, and extracts the resulting dynamic DOM content. An AJAX web application serves as a running example to explain the approach. The authors also implement the concepts and algorithms discussed in the paper in a tool. Finally, they report a number of empirical studies in which the approach is applied to representative AJAX applications. The results show that the method performs better, often with a faster rate of state discovery, and that “event-driven” crawling can effectively and accurately crawl dynamic content from AJAX-based applications.
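
The loop described in the abstract (scan the DOM tree, find state-changing elements, fire events, record the new DOM) can be illustrated with a small sketch. This is not the authors' tool or algorithm: it is a minimal breadth-first variant run against a hypothetical in-memory stand-in for a browser. The names ToyApp, Node, clickables, replay and crawl are all assumptions introduced for illustration, and a real crawler would drive an actual browser instead.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]  # child indexes leading from the root to an element


@dataclass
class Node:
    """A toy DOM element; on_click names the state its handler loads."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    on_click: Optional[str] = None


class ToyApp:
    """Hypothetical stand-in for a browser tab running an AJAX application."""
    PAGES = {
        "home": ("Welcome", ["news", "about"]),
        "news": ("Latest news", ["home", "item"]),
        "about": ("About us", ["home"]),
        "item": ("Story body", ["news"]),
    }

    def __init__(self) -> None:
        self.page = "home"

    def reload(self) -> None:
        self.page = "home"

    def dom(self) -> Node:
        text, links = self.PAGES[self.page]
        anchors = [Node("a", target, on_click=target) for target in links]
        return Node("body", children=[Node("div", text), Node("nav", children=anchors)])

    def fire(self, node: Node) -> None:
        # Simulates the handler rewriting the page without a URL change.
        if node.on_click is not None:
            self.page = node.on_click


def serialize(node: Node) -> str:
    inner = "".join(serialize(child) for child in node.children)
    return f"<{node.tag}>{node.text}{inner}</{node.tag}>"


def clickables(node: Node, path: Path = ()) -> List[Path]:
    """Scan the DOM tree and return paths of elements that carry a handler."""
    found = [path] if node.on_click is not None else []
    for index, child in enumerate(node.children):
        found.extend(clickables(child, path + (index,)))
    return found


def resolve(node: Node, path: Path) -> Node:
    for index in path:
        node = node.children[index]
    return node


def replay(app: ToyApp, trail: List[Path]) -> None:
    """Return to a known state by reloading and re-firing its event sequence."""
    app.reload()
    for path in trail:
        app.fire(resolve(app.dom(), path))


def crawl(app: ToyApp):
    """Breadth-first exploration; states are keyed by their serialized DOM."""
    app.reload()
    start = serialize(app.dom())
    states: Dict[str, List[Path]] = {start: []}
    edges = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        trail = states[current]
        replay(app, trail)
        for path in clickables(app.dom()):
            replay(app, trail)
            app.fire(resolve(app.dom(), path))
            snapshot = serialize(app.dom())
            edges.append((current, path, snapshot))
            if snapshot not in states:
                states[snapshot] = trail + [path]
                queue.append(snapshot)
    return states, edges
```

Calling crawl(ToyApp()) returns the discovered DOM snapshots, each with the event sequence that reaches it, plus the transitions between them. Comparing whole serialized DOMs is the simplest possible state-equality test and is chosen here only for brevity; whether the paper uses this or a different comparison is not stated in the abstract.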