On the effectiveness of testing sentiment analysis systems with metamorphic testing

2022 
Metamorphic testing (MT) has been successfully applied to a wide range of software systems. In these applications, the testing results of MT form the basis for drawing conclusions about the target system’s performance. Therefore, the effectiveness of MT is crucial to the trustworthiness of the derived conclusions.

However, due to the nature of MT, its effectiveness can be affected by various factors. Despite MT’s success, it is still important to study its effectiveness under different application contexts.

To investigate the effectiveness of MT, we focus on an important aspect, namely, false satisfactions (satisfactions of metamorphic relations that involve at least one failing execution), and revisit the application of MT to sentiment analysis (SA) systems. An in-depth analysis of the essence of false satisfactions reveals the situations in which they occur and how they affect the effectiveness of MT. Furthermore, 20 metamorphic relations (MRs) are identified to support a user-oriented evaluation of SA systems.

The occurrence rates of false satisfactions are reported for four SA systems. For the majority of MRs, false satisfactions account for about 20% to 50% of all MR satisfactions, suggesting that false satisfactions occur quite frequently in the evaluation of SA systems. It is also demonstrated that such high occurrence rates of false satisfactions adversely affect users’ selection of SA systems.

Our analysis reveals that, without considering the occurrence of false satisfactions, MT may overestimate the system’s conformance to the relevant MR. Furthermore, our experiments empirically show that conclusions derived from MT can be adversely affected when there are many false satisfactions. Our findings will help the MT community adopt a fairer and more reliable way of using the test outcomes of MT, and can also inspire the development of solid foundations for MT.
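The notion of a false satisfaction can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper: the keyword-based `toy_classifier`, the single MR ("replacing a word with a synonym should not change the predicted sentiment"), the four test cases, and their ground-truth labels are all assumptions chosen for demonstration. An MR satisfaction is counted as false when at least one of the two executions involved returns a wrong label.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MRCase:
    """One source/follow-up pair with assumed ground-truth labels."""
    source: str
    follow_up: str
    source_label: str
    follow_up_label: str


def toy_classifier(text: str) -> str:
    """Hypothetical keyword-based SA system (a stand-in, not a real tool)."""
    lowered = text.lower()
    if "good" in lowered or "great" in lowered:
        return "positive"
    if "bad" in lowered or "awful" in lowered:
        return "negative"
    return "neutral"


def evaluate(cases: List[MRCase], classify: Callable[[str], str]) -> Dict[str, float]:
    """Count MR satisfactions and how many of them are false satisfactions."""
    satisfied = 0
    false_satisfied = 0
    for case in cases:
        out_source = classify(case.source)
        out_follow_up = classify(case.follow_up)
        # Assumed MR: a synonym replacement must leave the sentiment unchanged.
        if out_source == out_follow_up:
            satisfied += 1
            # False satisfaction: the MR holds, yet at least one execution failed.
            if out_source != case.source_label or out_follow_up != case.follow_up_label:
                false_satisfied += 1
    rate = false_satisfied / satisfied if satisfied else 0.0
    return {
        "satisfied": satisfied,
        "violated": len(cases) - satisfied,
        "false_satisfied": false_satisfied,
        "false_satisfaction_rate": rate,
    }


# Hand-made cases; labels are assumed ground truth for illustration.
CASES = [
    MRCase("The food was good.", "The meal was good.", "positive", "positive"),
    MRCase("The plot was dreadful.", "The story was dreadful.", "negative", "negative"),
    MRCase("The acting was bad.", "The acting was dreadful.", "negative", "negative"),
    MRCase("The screen is great.", "The display is great.", "positive", "positive"),
]

if __name__ == "__main__":
    print(evaluate(CASES, toy_classifier))
```

On these four assumed cases the toy classifier satisfies the MR three times, but one of those satisfactions is false: both "dreadful" sentences are labelled neutral, so the outputs agree even though both are wrong. Counting raw satisfactions alone would therefore overstate this classifier's conformance to the MR, which is the effect the abstract describes.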