Balancing Misclassification Rates in Classification-Tree Models of Software Quality

2000 
Software product and process metrics can be useful predictors of which modules are likely to have faults during operations. Developers and managers can use the predictions of software quality models to focus enhancement efforts before release. In practice, however, software quality modeling methods in the literature may not produce a useful balance between the two kinds of misclassification rates, especially when there are few faulty modules. This paper presents a practical classification rule, in the context of classification-tree models, that allows appropriate emphasis on each type of misclassification according to the needs of the project. An industrial case study using classification trees illustrates the tradeoffs. The trees were built using the TREEDISC algorithm, which is a refinement of the CHAID algorithm. We examined two releases of a very large telecommunications system and built models suited to two points in the development life cycle: the end of coding and the end of beta testing. Both trees had only five significant predictors, out of 28 and 42 candidates, respectively. We interpreted the structure of the classification trees and found that the models had useful accuracy.
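The tradeoff between the two kinds of misclassification can be illustrated with a minimal sketch. It assumes one plausible form of rule, which the abstract does not spell out: a module is predicted faulty when the fraction of faulty training modules in its tree leaf reaches a project-chosen threshold. The function name, the threshold rule, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def misclassification_rates(
    leaf_fractions: List[float], labels: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (false-alarm rate, miss rate) for a threshold rule.

    Assumed rule (hypothetical, not the paper's): a module is predicted
    faulty when the faulty fraction of its leaf is >= threshold.
    """
    false_alarms = misses = n_faulty = n_clean = 0
    for fraction, is_faulty in zip(leaf_fractions, labels):
        predicted_faulty = fraction >= threshold
        if is_faulty:
            n_faulty += 1
            if not predicted_faulty:
                misses += 1  # faulty module predicted not faulty
        else:
            n_clean += 1
            if predicted_faulty:
                false_alarms += 1  # clean module predicted faulty
    false_alarm_rate = false_alarms / n_clean if n_clean else 0.0
    miss_rate = misses / n_faulty if n_faulty else 0.0
    return false_alarm_rate, miss_rate


# Toy data (invented): three leaves holding 20 modules, 4 of them faulty.
# Leaf A: 10 modules, 0 faulty; leaf B: 5 modules, 1 faulty;
# leaf C: 5 modules, 3 faulty.
fractions = [0.0] * 10 + [0.2] * 5 + [0.6] * 5
labels = (
    [False] * 10
    + [True] + [False] * 4
    + [True] * 3 + [False] * 2
)

for t in (0.5, 0.2):
    fa, miss = misclassification_rates(fractions, labels, t)
    print(f"threshold={t}: false-alarm rate={fa}, miss rate={miss}")
```

On this toy data, lowering the threshold from 0.5 to 0.2 removes every miss but triples the false alarms, which is the kind of balance a project would tune when faulty modules are rare.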