Automatic classification of software artifacts in open-source applications

Yuzhan Ma, Sarah Fakhoury, Michael Christensen, Venera Arnaoudova, Waleed Zogaan, Mehdi Mirakhorli

2018

With the increasing popularity of open-source software development, the number of software artifacts that provide insight into how people build software has grown tremendously. Researchers need large-scale, representative collections of software artifacts to validate novel and existing techniques systematically and without bias. For example, in software requirements traceability, researchers often use software applications with multiple types of artifacts, such as requirements, system elements, verifications, or tasks, to develop and evaluate their traceability analysis techniques. However, manually identifying rich software artifacts is very labor-intensive. In this work, we first conduct a large-scale study to identify which types of software artifacts a wide variety of open-source projects produce, at different levels of granularity. We then propose an automated approach based on machine learning techniques to identify various types of software artifacts. Through a set of experiments, we report and compare the performance of these techniques when applied to software artifacts.
