A New Code Change Prediction Dataset: A Case Study Based on HEP Software

2020 
Predicting change proneness in software modules is an open area of research. The task requires code-change datasets, which are typically incomplete or missing altogether. To construct such a dataset properly, we defined a new dictionary of software-change terms, drawing on our experience with High Energy Physics (HEP) software. The dictionary includes terms that classify a "code change", such as warning, fixed bug, minor fix, and optimization. Each term has been used, as appropriate, to label each software module analyzed. The derived categories range from code development to performance improvements and refer to individual pieces of the software considered. The resulting code-change dataset has been used to build a prediction model able to monitor software evolution and assess its maintainability over time. This article details the procedure followed and presents the results obtained. The dictionary can be applied to other, non-HEP software, provided that researchers can rely on well-documented code changes. The prediction model can then be tested against these new datasets to improve both its reliability and its performance.
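As a rough illustration of how a dictionary of change terms might be used to label modules, the following Python sketch matches keywords in change descriptions and counts the resulting categories per module. Only the four category names (warning, fixed bug, minor fix, optimization) come from the abstract; the keywords, the function names `classify_change` and `label_modules`, the module names, and the sample messages are hypothetical and are not the authors' actual dictionary or procedure.

```python
from collections import Counter

# Hypothetical dictionary: category -> keywords that signal it.
# The category names come from the abstract; the keywords are assumptions.
CHANGE_TERMS = {
    "warning": ("warning",),
    "fixed bug": ("bug", "crash"),
    "minor fix": ("typo", "minor", "cleanup"),
    "optimization": ("optimiz", "speed up", "performance"),
}


def classify_change(message):
    """Return the first category whose keyword occurs in the message, else None."""
    text = message.lower()
    for category, keywords in CHANGE_TERMS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return None


def label_modules(changes):
    """Count change categories per module from (module, message) pairs."""
    labels = {}
    for module, message in changes:
        category = classify_change(message)
        if category is not None:
            labels.setdefault(module, Counter())[category] += 1
    return labels


# Made-up change log entries, for illustration only.
sample_changes = [
    ("module_a", "Fix compiler warning in solid builder"),
    ("module_a", "Fixed bug in navigation"),
    ("module_b", "Optimize stepping loop"),
    ("module_b", "Update docs"),
]
module_labels = label_modules(sample_changes)
```

Per-module counts of this kind are one plausible input to a change-proneness model. Simple substring matching is used here only for brevity, and entries that match no term are left unlabeled.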