An Empirical Evaluation of Machine Learning Algorithms for Identifying Software Requirements on Stack Overflow: Initial Results

2019 
Context: Developments in requirements engineering (RE) methods over the last two decades have seen a rise in the use of machine-learning (ML) algorithms to solve complex RE problems. One such problem is identifying and classifying software requirements on Stack Overflow (SO). ML-based techniques applied to this problem have shown convincing results, considerably better than those produced by traditional natural language processing (NLP) techniques. Nevertheless, a comprehensive and systematic understanding of these ML-based techniques is still lacking.
Objective: To identify and classify the ML algorithms used for identifying software requirements on SO.
Method: This article reports a systematic literature review (SLR) gathering evidence published up to August 2019.
Results: The study identified 1073 published papers related to RE and SO, of which 12 were selected as primary studies. The data extraction process revealed that (1) Latent Dirichlet Allocation (LDA) topic modeling is the most widely used ML algorithm in the selected studies, and (2) precision and recall are the most commonly used metrics for evaluating the performance of these ML algorithms.
Conclusion: The SLR finds that while ML algorithms have great potential for identifying software requirements on SO, they face open issues that ultimately affect their performance and practical application. The SLR calls for collaboration between RE and ML researchers to tackle the open issues facing the development of real-world ML systems.
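To make the two findings concrete, the following is a minimal sketch (not taken from any of the reviewed studies) of how LDA topic modeling could be applied to SO-style post texts and how a resulting requirement/non-requirement classification could be scored with precision and recall. It assumes scikit-learn (CountVectorizer, LatentDirichletAllocation, precision_score, recall_score); the post texts, labels, and topic-to-label alignment rule are invented for illustration only.

```python
# Hedged sketch: LDA topics over Stack Overflow-style posts, scored with precision/recall.
# All posts and gold labels below are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics import precision_score, recall_score

posts = [
    "The app must encrypt user data before upload",                # requirement-like
    "The system shall respond to login requests within 2 seconds", # requirement-like
    "How do I fix a NullPointerException in Java?",                # ordinary Q&A
    "Why is my CSS flexbox layout not centering the div?",         # ordinary Q&A
]

# Bag-of-words features, then a 2-topic LDA model
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # per-document topic distributions

# Inspect the top terms of each inferred topic
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {k}: {top}")

# Hypothetical gold labels (1 = requirement) and a naive rule that treats each
# post's dominant topic as its predicted class, purely for illustration.
y_true = [1, 1, 0, 0]
y_pred = doc_topics.argmax(axis=1)
# Topic indices are arbitrary, so flip the mapping if it disagrees with label 1.
if precision_score(y_true, y_pred, zero_division=0) < 0.5:
    y_pred = 1 - y_pred
print("precision:", precision_score(y_true, y_pred, zero_division=0))
print("recall:", recall_score(y_true, y_pred, zero_division=0))
```

On such a toy corpus the metrics are not meaningful; the sketch only shows the pipeline shape (unsupervised topic model plus a labeled evaluation) that the reviewed studies report using.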