Traditional Chinese Medicine knowledge Service based on Semi-Supervised BERT-BiLSTM-CRF Model

2020 
Most Traditional Chinese Medicine (TCM) data and ancient records exist in the form of books. This unstructured medical information is the foundation for building TCM knowledge services. Existing methods for TCM named entity recognition are not accurate enough and require large amounts of manually labeled data. This paper proposes a semi-supervised embedded model, Semi-BERT-BiLSTM-CRF. Based on the book “Diagnosis of Traditional Chinese Medicine in Traditional Chinese Medicine”, we select the physical features from the cleaned text according to the characteristics of Chinese medicine classics, and then train a BERT-BiLSTM-CRF model on a small amount of labeled data. The trained model is used to predict labels for the unlabeled data, producing pseudo-labeled data. The pseudo-labeled data and the labeled data are then combined into a single training set for further model training. Experiments show that the TCM entity recognition accuracy of this method reaches 81.24%, which effectively improves recognition accuracy and reduces manual labeling work. After subsequent improvement and adaptation, the results of this research can be applied to scenarios such as TCM auxiliary diagnosis and expert systems.
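The pseudo-labeling procedure described in the abstract can be sketched as a short self-training loop. This is only an illustration of the data flow under stated assumptions, not the authors' implementation: `ToyTagger` is a hypothetical stand-in for the BERT-BiLSTM-CRF model (it simply memorizes the most frequent tag per token), the tag names such as `B-SYM` are invented for the example, and the abstract does not say whether pseudo-labels are filtered by confidence, so no filtering is applied here.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sentence = List[str]
Tags = List[str]
Example = Tuple[Sentence, Tags]


class ToyTagger:
    """Hypothetical stand-in for BERT-BiLSTM-CRF (assumption, not the paper's model).

    It tags each token with the label it was most often seen with during
    training, and falls back to "O" for unseen tokens.
    """

    def __init__(self) -> None:
        self.best: Dict[str, str] = {}

    def fit(self, data: List[Example]) -> "ToyTagger":
        counts: Dict[str, Counter] = defaultdict(Counter)
        for tokens, tags in data:
            for tok, tag in zip(tokens, tags):
                counts[tok][tag] += 1
        self.best = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
        return self

    def predict(self, tokens: Sentence) -> Tags:
        return [self.best.get(tok, "O") for tok in tokens]


def self_train(labeled: List[Example], unlabeled: List[Sentence]):
    """One round of pseudo-labeling, following the steps in the abstract."""
    # Step 1: train on the small labeled set.
    teacher = ToyTagger().fit(labeled)
    # Step 2: predict tags for the unlabeled sentences to get pseudo-labels.
    pseudo = [(sent, teacher.predict(sent)) for sent in unlabeled]
    # Step 3: retrain on labeled + pseudo-labeled data together.
    student = ToyTagger().fit(labeled + pseudo)
    return student, pseudo


# Tiny made-up example; tokens and tags are illustrative only.
labeled_data = [(["tou", "tong"], ["B-SYM", "I-SYM"])]
unlabeled_data = [["tou", "tong", "fa", "re"]]
model, pseudo_labeled = self_train(labeled_data, unlabeled_data)
```

With a real sequence tagger in place of `ToyTagger`, the same three steps apply; only `fit` and `predict` would change.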