GitHub Label Embeddings

2020 
GitHub repository issues can be “tagged” with labels to improve understanding, organization, and classification, and to make information retrieval easier for both users and project managers. GitHub provides nine default labels and allows users to create, edit, and delete labels to fit the project maintainers’ management goals. Such labels can, for example, help users find open source projects that welcome new collaborators, since they can search for the default label good first issue in GitHub’s search engine. However, this mechanism would be more powerful if the platform recognized semantically similar custom labels and also returned the projects that use them. In this study, we investigate two NBNE-based approaches and a third based on the Word2Vec algorithm to represent labels as embeddings (i.e., as vectors in a multidimensional space), so that semantically similar labels lie closer together. We found that Word2Vec is better suited to this task, although this result deserves further investigation.
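The idea that semantically similar labels end up close together in the embedding space can be illustrated with a minimal sketch. It assumes label vectors have already been learned by some method, and the three-dimensional vectors, the label names other than good first issue, and the helper names `cosine_similarity` and `most_similar` are all hypothetical, chosen only for illustration. They are not results or code from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(label, embeddings, top_n=2):
    """Rank the other labels by cosine similarity to `label`."""
    query = embeddings[label]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != label
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy vectors, invented for this example only.
toy_embeddings = {
    "good first issue": [0.9, 0.1, 0.0],
    "beginner friendly": [0.8, 0.2, 0.1],
    "easy": [0.7, 0.3, 0.0],
    "bug": [0.0, 0.1, 0.9],
}

for name, score in most_similar("good first issue", toy_embeddings):
    print(f"{name}: {score:.3f}")
```

With embeddings of this kind, a search for good first issue could in principle be expanded to the nearest custom labels, which is the use case the abstract motivates.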