RCorp: a Resource for Chemical Disease Semantic Extraction in Chinese.

2019 
To identify knowledge of combination therapy for chronic diseases, high-quality manually curated corpora are required. In this study, we describe the construction of a corpus for chemical disease semantic extraction in Chinese (RCorp) from a collection of Chinese biomedical abstracts. During the construction of RCorp, we used existing domain guidelines as references and incorporated annotation tools into the standard annotation process. The resulting corpus consists of 339 Chinese biomedical articles with 2,367 annotated chemicals, 2,113 diseases, 237 symptoms, 164 chemical-induce-disease relations, 163 chemical-induce-symptom relations, and 805 chemical-treat-disease relations. RCorp achieves inter-annotator agreement, measured by F score, of 0.883 for chemical entities and 0.791 for disease entities. The F score for chemical-treat-disease relations reaches 0.788 after unifying the entity mentions. Analysis of the resulting corpus indicates that its quality is adequate for the task of knowledge discovery on combination therapy for chronic diseases.
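As a rough illustration of how inter-annotator agreement can be expressed as an F score, the sketch below treats one annotator's entity set as the reference and scores the other against it using exact matching. The annotation layout (document id, start offset, end offset, entity type), the function name `agreement_f_score`, and the exact-match criterion are all assumptions made for this example; the abstract does not state how matches were counted in RCorp.

```python
from typing import Set, Tuple

# Hypothetical annotation layout: (doc_id, start, end, entity_type).
Annotation = Tuple[str, int, int, str]


def agreement_f_score(annotator_a: Set[Annotation],
                      annotator_b: Set[Annotation]) -> float:
    """Exact-match F score between two annotators, treating A as the reference."""
    if not annotator_a or not annotator_b:
        return 0.0
    matched = len(annotator_a & annotator_b)   # annotations both annotators agree on
    precision = matched / len(annotator_b)     # share of B's annotations found in A
    recall = matched / len(annotator_a)        # share of A's annotations found in B
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not taken from RCorp).
a = {("doc1", 0, 3, "Chemical"), ("doc1", 10, 14, "Disease"), ("doc1", 20, 23, "Chemical")}
b = {("doc1", 0, 3, "Chemical"), ("doc1", 10, 14, "Disease"), ("doc1", 30, 33, "Chemical")}
score = agreement_f_score(a, b)
```

With exact matching, the balanced F score equals twice the number of shared annotations divided by the total number of annotations from both annotators, so the result does not depend on which annotator is taken as the reference.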