CDRnN: A high performance chemical-disease recognizer in biomedical literature

2017 
Diseases and chemicals play central roles in many areas of biomedical research and healthcare. Consequently, aggregating disease knowledge and treatment research reports has become a critical issue, especially in rapidly growing knowledge bases (e.g., PubMed). A framework for disease/chemical named entity recognition and normalization has therefore become increasingly important for biomedical text mining. In this work, we define five types of variation in disease names and develop a system for disease/chemical mention recognition and normalization in biomedical texts. Our system uses a second-order conditional random field (CRF) model for recognition and refines its output with several customized post-processing steps, including abbreviation resolution, consistency improvement, stopword filtering, and adjective reorganization. In evaluation, we obtained the best performance, an F-score of 86.9% on disease normalization and a precision of 89.95% on chemical normalization. These results suggest that our system is a high-performance, state-of-the-art system for disease/chemical recognition and normalization in biomedical literature.
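To make the post-processing steps named in the abstract more concrete, the following is a minimal Python sketch of three of them (abbreviation resolution, stopword filtering, and consistency improvement). It is an illustration under stated assumptions, not the authors' implementation: the function names `find_abbreviations` and `postprocess`, the `STOPWORDS` list, and the initial-letter matching heuristic are all hypothetical, and the CRF recognizer itself is assumed to have already produced the input mentions.

```python
import re

# Hypothetical stopword list; the list used in the paper is not given.
STOPWORDS = {"disease", "drug", "syndrome"}


def find_abbreviations(text):
    """Map short forms to long forms using the 'long form (SF)' pattern.

    Assumption: a short form matches when its letters equal the initials
    of the words immediately preceding the parenthesis.
    """
    pairs = {}
    for m in re.finditer(r"\(([A-Za-z]{2,10})\)", text):
        short = m.group(1)
        words = text[:m.start()].split()
        candidate = words[-len(short):]
        initials = "".join(w[0] for w in candidate).lower()
        if initials == short.lower():
            pairs[short] = " ".join(candidate)
    return pairs


def postprocess(text, mentions):
    """Apply three illustrative post-processing steps to recognized mentions.

    Returns a sorted list of (start, end, mention) spans.
    """
    mentions = set(mentions)

    # Abbreviation resolution: if either form of a pair was recognized,
    # treat both the short and the long form as mentions.
    for short, long_form in find_abbreviations(text).items():
        if short in mentions or long_form in mentions:
            mentions.update({short, long_form})

    # Stopword filtering: drop mentions that are only a generic term.
    mentions = {m for m in mentions if m.lower() not in STOPWORDS}

    # Consistency improvement: tag every occurrence of each kept mention,
    # including occurrences the recognizer may have missed.
    spans = []
    for m in mentions:
        for hit in re.finditer(r"\b" + re.escape(m) + r"\b", text):
            spans.append((hit.start(), hit.end(), m))
    return sorted(spans)
```

For example, calling `postprocess` on the text "Patients with Parkinson disease (PD) were treated. PD progressed slowly." with the mentions `["Parkinson disease", "disease"]` would drop the bare stopword "disease" and tag "Parkinson disease" together with both occurrences of "PD".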