Automated Extraction of Requirement Entities by Leveraging LSTM-CRF and Transfer Learning

Requirement entities, "explicit specification of concepts that define the primary function objects", play an important role in requirement analysis for software development and maintenance. Extracting requirement entities from textual requirements is labor-intensive and is typically done manually. A few existing studies propose automated methods to support the extraction of key requirement concepts. However, they face two main challenges: a lack of domain-specific natural language processing techniques and expensive labeling effort. To address these challenges, this study presents a novel approach named RENE, which employs an LSTM-CRF model for requirement entity extraction and introduces general knowledge to reduce the demand for labeled data. It consists of four phases: 1) Model construction, where RENE builds an LSTM-CRF model and an isomorphic LSTM language model for transfer learning; 2) LSTM language model training, where RENE captures general knowledge and adapts it to the requirement context; 3) LSTM-CRF training, where RENE trains the LSTM-CRF model with the transferred layers; 4) Requirement entity extraction, where RENE applies the trained LSTM-CRF model to a new requirement and automatically extracts its requirement entities. RENE is evaluated in two ways: an evaluation on a historical dataset and a user study. The evaluation on the historical dataset shows that RENE achieves 79% precision, 81% recall, and 80% F1. The results of the user study also suggest that RENE produces more accurate and comprehensive requirement entities than those produced by engineers.
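As an illustration of the fourth phase, the following minimal sketch shows how a CRF layer's output can be decoded into requirement entities. It is not RENE's implementation. The BIO tag set (`O`, `B-ENT`, `I-ENT`), the example sentence, and all emission and transition scores are assumptions made up for the example; in an LSTM-CRF model the emission scores would come from the LSTM layers and the transition scores would be learned during training.

```python
from typing import Dict, List, Tuple

TAGS = ["O", "B-ENT", "I-ENT"]  # hypothetical BIO tag set, not taken from the paper


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag sequence (standard Viterbi decoding)."""
    if not emissions:
        return []
    # best[t][tag] = best score of any tag path ending in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = (best[-1][prev] + transitions.get((prev, cur), 0.0)
                           + emissions[t][cur])
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda k: best[-1][k])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


def extract_entities(tokens: List[str], tags: List[str]) -> List[str]:
    """Group B-/I- tagged tokens into entity strings; stray I- tags are ignored."""
    entities, current = [], []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif tag.startswith("I-") and current:
            current.append(token)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


# Made-up scores standing in for the output of a trained model.
tokens = ["Users", "can", "share", "the", "photo", "album"]
emissions = [
    {"O": 2.0, "B-ENT": 0.5, "I-ENT": 0.0},
    {"O": 2.0, "B-ENT": 0.0, "I-ENT": 0.0},
    {"O": 2.0, "B-ENT": 0.0, "I-ENT": 0.0},
    {"O": 2.0, "B-ENT": 0.0, "I-ENT": 0.0},
    {"O": 0.5, "B-ENT": 2.0, "I-ENT": 0.5},
    {"O": 0.5, "B-ENT": 0.5, "I-ENT": 2.0},
]
# Penalize an entity continuation directly after "O"; reward B-ENT -> I-ENT.
transitions = {("O", "I-ENT"): -10.0, ("B-ENT", "I-ENT"): 1.0}

tags = viterbi(emissions, transitions)
entities = extract_entities(tokens, tags)
print(entities)  # ['photo album']
```

The transition scores are what distinguish a CRF output layer from independent per-token classification: they let the decoder rule out invalid sequences, such as an `I-ENT` tag that directly follows `O`.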