WebEvo: taming web application evolution via detecting semantic structure changes

2021 
The development of Web technology and the beginning of the Big Data era have led to the development of technologies for extracting data from websites, such as information retrieval (IR) and robotic process automation (RPA) tools. As websites are constantly evolving, to prevent these tools from functioning improperly due to website evolution, it is important to monitor the changes in websites and report them to the developers and testers. Existing monitoring tools mainly use DOM-tree based techniques to detect changes in the new web pages. However, these monitoring tools incorrectly report content-based changes (i.e., web content refreshed every time a web page is retrieved) as the changes that will adversely affect the performance of the IR and RPA tools. This results in false warnings since the IR and RPA tools typically consider these changes as expected and retrieve dynamic data from them. Moreover, these monitoring tools cannot identify GUI widget evolution (e.g., moving a button), and thus cannot help the IR and RPA tools adapt to the evolved widgets (e.g., automatic repair of locators for the evolved widgets). To address the limitations of the existing monitoring tools, we propose an approach, WebEvo, that leverages historic pages to identify the DOM elements whose changes are content-based changes, which can be safely ignored when reporting changes in the new web pages. Furthermore, to identify refactoring changes that preserve semantics and appearances of GUI widgets, WebEvo adapts computer vision (CV) techniques to identify the mappings of the GUI widgets from the old web page to the new web page on an element-by-element basis. Empirical evaluations on 13 real-world websites from 9 popular categories demonstrate the superiority of WebEvo over the existing DOM-tree based detection or whole-page visual comparison in terms of both effectiveness and efficiency.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    37
    References
    0
    Citations
    NaN
    KQI
    []