Automatically Classifying Functional and Non-functional Requirements Using Supervised Machine Learning

2017 
In this paper, we take up the second RE17 data challenge: the identification of requirements types using the "Quality attributes (NFR)" dataset provided. We studied how accurately we can automatically classify requirements as functional (FR) and non-functional (NFR) in the dataset with supervised machine learning. Furthermore, we assessed how accurately we can identify various types of NFRs, in particular usability, security, operational, and performance requirements. We developed and evaluated a supervised machine learning approach employing meta-data, lexical, and syntactical features. We employed under-and over-sampling strategies to handle the imbalanced classes in the dataset and cross-validated the classifiers using precision, recall, and F1 metrics in a series of experiments based on the Support Vector Machine classifier algorithm. We achieve a precision and recall up to ~92% for automatically identifying FRs and NFRs. For the identification of specific NFRs, we achieve the highest precision and recall for security and performance NFRs with ~92% precision and ~90% recall. We discuss the most discriminating features of FRs and NFRs as well as the sampling strategies used with an additional dataset and their impact on the classification accuracy.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    36
    References
    57
    Citations
    NaN
    KQI
    []