Examining the effectiveness of using concolic analysis to detect code clones

2015 
During the initial construction and subsequent maintenance of an application, duplication of functionality is common, whether intentional or otherwise. This replicated functionality, known as a code clone, has a diverse set of causes and can have moderate to severe adverse effects on a software project in a variety of ways. A code clone is defined as multiple code fragments that produce similar results when provided the same input. While there is an array of powerful clone detection tools, most suffer from a variety of drawbacks including, most importantly, the inability to accurately and reliably detect the more difficult clone types. This paper presents a new technique for detecting code clones based on concolic analysis, which uses a mixture of concrete and symbolic values to traverse a large and diverse portion of the source code. By performing concolic analysis on the targeted source code and then examining the holistic output for similarities, code clone candidates can be consistently identified. We found that concolic analysis was able to accurately and reliably discover all four types of code clones with an average precision of .8, recall of .91, F-score of .85 and an accuracy of .99.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    30
    References
    3
    Citations
    NaN
    KQI
    []