Many-To-Many Chinese ICD-9 Terminology Standardization Based on Neural Networks.

2021 
ICD-9 terminology standardization maps colloquial medical terms to standard ICD-9 terms. Because natural language is used freely, a colloquial expression may be a single original term or a combination of original terms, and it may correspond to a single standard term or a combination of standard terms. The task therefore has to map multiple original terms to multiple standard terms, i.e., the many-to-many standardization problem. In this setting, a BERT-ranking-based standardization method that outputs the N most probable ICD-9 terms (Top-N) as the standard terms is affected by the proportion each original term occupies in the original term combination. When the proportions differ, several of the Top-N ICD-9 terms may all derive from a single original term in the combination, which significantly degrades prediction performance. This paper therefore proposes a neural-network-based method for many-to-many Chinese ICD-9 terminology standardization, consisting of 1) a method for splitting original term combinations based on named entity recognition and part-of-speech tagging, and 2) a method for candidate term set construction and term standardization based on N-grams and BERT. To evaluate the method, the paper constructs two many-to-many datasets, CHIP-MTM and SCD, from the CHIP public dataset and real-world electronic medical records. Experimental results show that the method improves accuracy by 7.7% on both CHIP-MTM and SCD.
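To illustrate why splitting before ranking can help, the following is a minimal sketch of per-term candidate retrieval. It is not the paper's implementation: the abstract only states that candidates are built with N-grams and ranked with BERT, so the Dice score over character bigrams, the function names, and the toy English strings used here are all assumptions, and the BERT ranking step is omitted. The point shown is that each split term receives its own candidate list, so one term cannot occupy every Top-N slot.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return the multiset of character n-grams of `text`."""
    if len(text) < n:
        return Counter([text]) if text else Counter()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_score(query, candidate, n=2):
    """Dice coefficient over character n-gram multisets (an assumed scoring choice)."""
    q, c = char_ngrams(query, n), char_ngrams(candidate, n)
    total = sum(q.values()) + sum(c.values())
    if total == 0:
        return 0.0
    overlap = sum((q & c).values())  # multiset intersection
    return 2.0 * overlap / total


def candidates_per_term(original_terms, standard_terms, k=2, n=2):
    """Keep a separate top-k candidate list for each split original term."""
    result = {}
    for term in original_terms:
        ranked = sorted(standard_terms,
                        key=lambda s: ngram_score(term, s, n),
                        reverse=True)
        result[term] = ranked[:k]
    return result


# Toy data, hypothetical and not taken from ICD-9 or the paper's datasets.
standard_terms = ["appendectomy", "cholecystectomy", "laparoscopic cholecystectomy"]
split_terms = ["lap cholecystectomy", "appendectomy"]  # assumed output of the split step
per_term = candidates_per_term(split_terms, standard_terms, k=2)
for term, cands in per_term.items():
    print(term, "->", cands)
```

In a full pipeline, each per-term candidate list would then be passed to a ranking model, and the final output would combine the best candidate from every list rather than a single global Top-N.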