Evaluating a LSTM Neural Network and a Word2vec Model in the Classification of Self-admitted Technical Debts and Their Types in Code Comments.

2020 
Context: Software development teams often choose faster, lower-quality solutions to current problems without planning for the future. This choice has a negative long-term impact and is called technical debt. Like financial debt, technical debt accrues interest and must be detected and managed so that the team can decide how best to deal with it. One way to detect technical debt is to classify source code comments: developers often leave comments warning that their own code needs to be improved in the future. Such comments are known as Self-Admitted Technical Debt (SATD). Objective: To combine Word2vec word embeddings with a Long Short-Term Memory (LSTM) neural network model to identify SATD in source code comments, and to compare the results with other studies and with an LSTM without word embedding. Method: We planned and executed an experimental process that includes validation of the model's effectiveness data. Results: In general, classification improved when all SATD types were grouped under a single label. Compared with other studies, the LSTM model with Word2vec achieved better recall and F-measure. The LSTM model without word embedding achieved higher recall but performed worse in precision and F-measure. Conclusion: We found evidence that LSTM models combined with word embedding are promising for the development of a more effective SATD classifier.
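The results above are stated in terms of precision, recall, and F-measure, both per SATD type and with all types grouped under a single label. The following is a minimal sketch of that evaluation step only, not the authors' code: the label names (`DESIGN`, `DEFECT`, `TEST`, `WITHOUT_CLASSIFICATION`), the helper names, and the toy predictions are all hypothetical and are not data from the study.

```python
from typing import List, Sequence, Tuple


def group_satd_types(labels: Sequence[str],
                     non_debt: str = "WITHOUT_CLASSIFICATION") -> List[str]:
    """Collapse every debt type into one 'SATD' label (hypothetical label names)."""
    return ["SATD" if label != non_debt else non_debt for label in labels]


def precision_recall_f1(y_true: Sequence[str], y_pred: Sequence[str],
                        positive: str = "SATD") -> Tuple[float, float, float]:
    """Compute precision, recall and F-measure (F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy, made-up labels for six comments (illustration only, not study data).
y_true = ["DESIGN", "DEFECT", "WITHOUT_CLASSIFICATION",
          "DESIGN", "WITHOUT_CLASSIFICATION", "TEST"]
y_pred = ["DESIGN", "DESIGN", "WITHOUT_CLASSIFICATION",
          "WITHOUT_CLASSIFICATION", "DESIGN", "TEST"]

# Per-type score for one class, then the score with all types grouped.
per_type = precision_recall_f1(y_true, y_pred, positive="DESIGN")
grouped = precision_recall_f1(group_satd_types(y_true),
                              group_satd_types(y_pred))
print("DESIGN only:", per_type)
print("All SATD grouped:", grouped)
```

In this toy data a `DEFECT` comment predicted as `DESIGN` counts as an error for the per-type score but as a correct detection once the types are grouped, which is one plausible reason grouped classification can score higher.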