A Pipeline for Automating Labeling to Prediction in Classification of NFRs

2021 
Non-Functional Requirements (NFRs) describe the operational constraints of a software system. Detecting NFRs early allows them to be incorporated into the architectural design at an initial stage, which is preferable to expensive refactoring later. Automated identification and classification of NFRs has therefore seen numerous efforts using rule-based, machine learning, and deep learning-based approaches. One of the major challenges for such automation is the manual effort that must be invested in labeling training data. This is a concern for large software vendors, who typically work on a variety of applications in diverse domains. We address this challenge by designing a pipeline that facilitates classification of NFRs using only a limited amount (about 20% of an available new dataset) of labeled data for training. We (1) employed Snorkel to automatically label a dataset comprising NFRs from various Software Requirement Specification documents, (2) trained several classifiers on the labeled dataset, and (3) reused these pre-trained classifiers, using a Transfer Learning approach, to classify NFRs in industry-specific datasets. Among the various language model classifiers, the best results were obtained with a BERT-based classifier fine-tuned to learn the linguistic intricacies of three different domain-specific datasets from real-life projects.
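
To make the automatic labeling step more concrete, the following is a minimal sketch of the weak-supervision idea in plain Python. It does not use Snorkel's API and is not the pipeline described above: the class names, keyword lists, and the helper `weak_label` are hypothetical, and a simple majority vote stands in for the learned label model that Snorkel uses to combine labeling functions.

```python
from collections import Counter

# Hypothetical label set; the actual NFR classes in the paper may differ.
ABSTAIN = -1
SECURITY, PERFORMANCE, USABILITY = 0, 1, 2


def lf_security(text):
    """Vote SECURITY if a security-related keyword appears, else abstain."""
    keywords = ("encrypt", "authenticat", "password")
    return SECURITY if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_performance(text):
    """Vote PERFORMANCE if a timing or throughput keyword appears."""
    keywords = ("response time", "within", "seconds", "throughput")
    return PERFORMANCE if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_usability(text):
    """Vote USABILITY if an ease-of-use keyword appears."""
    keywords = ("easy to use", "intuitive", "learn")
    return USABILITY if any(k in text.lower() for k in keywords) else ABSTAIN


LABELING_FUNCTIONS = [lf_security, lf_performance, lf_usability]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on ties or no votes."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting votes with no clear winner
    return ranked[0][0]


if __name__ == "__main__":
    for req in (
        "All passwords shall be encrypted.",
        "The system shall respond within 2 seconds.",
        "The product shall be blue.",
    ):
        print(req, "->", weak_label(req))
```

Requirements on which every function abstains, or on which the votes conflict, stay unlabeled in this sketch; only the confidently labeled ones would be passed on to train a classifier.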