A hosting service of multi-language Historage repositories

2016 
In Mining Software Repositories research, source code repositories are one of the core data sources because they contain both the product and the process of software development. A source code repository stores the versions of files and makes it possible to browse file histories, such as modification dates, authors, and messages. Although such rich file-history information is easily available, extracting the histories of methods and functions, which are elements of source code files, is not easy with general code repositories. To tackle this difficulty, we have developed Historage, a fine-grained version control system. A Historage repository is a Git repository built upon an original Git repository. Therefore, the mining techniques used for general Git repositories are also applicable to Historage repositories. We have also developed Kataribe, a hosting service for Historage repositories, which contains hundreds of Historage repositories constructed from GitHub repositories written in C#, Java, Python, and Ruby. The list of all Historage and original repositories is available at http://kataribe.naist.jp/public. With this dataset, we aim to promote in-depth, fine-grained software evolution research across a diversity of programming languages.
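Because a Historage repository is an ordinary Git repository, the history of a single method could in principle be read with plain `git log`. The following is a minimal sketch under stated assumptions, not the authors' tooling: it assumes each method or function is tracked as its own path inside the Historage repository (the layout is not described above), and the names `Commit`, `parse_log`, and `entity_history` are hypothetical helpers introduced only for illustration.

```python
import subprocess
from typing import List, NamedTuple


class Commit(NamedTuple):
    sha: str
    author: str
    date: str
    subject: str


# Fields are joined with the ASCII unit separator (0x1f) so that subjects
# containing tabs or pipes do not break parsing.
_FORMAT = "%H%x1f%an%x1f%aI%x1f%s"


def parse_log(output: str) -> List[Commit]:
    """Parse `git log` output produced with _FORMAT into Commit records."""
    commits = []
    for line in output.split("\n"):
        if not line:
            continue
        sha, author, date, subject = line.split("\x1f", 3)
        commits.append(Commit(sha, author, date, subject))
    return commits


def entity_history(repo_dir: str, entity_path: str) -> List[Commit]:
    """Return the commits that touched one tracked path, newest first.

    Assumption: `entity_path` is the path under which a single method is
    stored in the Historage repository; the real layout may differ.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "--follow",
         f"--format={_FORMAT}", "--", entity_path],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling `entity_history` on a locally cloned Historage repository with the path of one method would then yield that method's modification dates, authors, and messages, the same information that is otherwise available only at file level.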