AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models.
2021
Code retrieval lets software engineers search for code with a natural
language query, and it relies on both natural language processing and
software engineering techniques. Previous work on code retrieval has ranged
from searching code snippets to searching functions. In this paper, we
introduce Augmented Code (AugmentedCode) retrieval, which takes advantage of
existing information within the code and constructs an augmented programming
language to improve the performance of code retrieval models. We curated a
large corpus of Python and showcase the framework and the results of the
augmented programming language, which outperforms CodeSearchNet and CodeBERT
with a Mean Reciprocal Rank (MRR) of 0.73 and 0.96, respectively. The
fine-tuned augmented code retrieval model is published on HuggingFace at
https://huggingface.co/Fujitsu/AugCode and a demonstration video is available
at https://youtu.be/mnZrUTANjGs .
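
For readers unfamiliar with the metric, MRR averages, over all queries, the reciprocal of the rank at which the correct code appears in the retrieved list. The following is a minimal sketch in Python; the function name and the example ranks are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Compute MRR from per-query ranks.

    Each entry is the 1-based rank of the correct code for one query,
    or None when the correct code was not retrieved at all (scored as 0).
    """
    if not ranks:
        raise ValueError("ranks must contain at least one query")
    total = 0.0
    for rank in ranks:
        if rank is not None:
            if rank < 1:
                raise ValueError("ranks are 1-based and must be >= 1")
            total += 1.0 / rank
    return total / len(ranks)


# Hypothetical example: three queries whose correct code is ranked 1st, 2nd, and 4th.
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

In this hypothetical example the score is (1 + 1/2 + 1/4) / 3 = 7/12, roughly 0.58. An MRR close to 1 means the correct code is ranked first for almost every query.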