A new internal metric for software clustering algorithms validity

2020 
Clustering (modularisation) techniques are often used to decompose a program into meaningful parts so that it can be understood. In the software clustering context, several external metrics have been proposed to evaluate and validate the clustering produced by an algorithm. These metrics compare the resultant clustering against a ground-truth decomposition. When no ground-truth decomposition exists for a software system, internal metrics are used to validate clustering algorithms instead. Because they compare against a reference decomposition, external metrics are preferred to internal metrics. Existing internal metrics for measuring clustering quality are ill-suited to this evaluation because they do not consider the purpose of software clustering, which is to understand a software system. In this study, the authors present six criteria that influence the understanding of a program. They then design an internal metric that estimates software clustering quality on the basis of those criteria. To assess the reliability of the proposed metric, they selected ten folders of Mozilla Firefox with different sizes and functionalities. The experimental results confirm that the proposed internal metric is more accurate than the existing internal metrics in terms of proximity to expert decomposition. The proposed internal metric can therefore serve as a substitute for external metrics when no ground-truth decomposition is available.