Multi-task Learning for Attribute Extraction from Unstructured Electronic Medical Records

2019 
Electronic medical records are widely used in hospitals to store patient information in digital form, which makes it convenient to reuse patients' medical data for teaching and scientific research, and to analyze and mine that data as a basis for medical research. However, most existing methods work on the structured data in electronic medical records, and research on the unstructured text is rare, so a lot of important information is lost. In this paper, we focus on attribute extraction from the unstructured text of electronic medical records and propose a multi-task learning model that jointly learns related tasks to improve the generalization performance of all of them. Specifically, we use an end-to-end neural network model to extract different attribute values from the same unstructured text. We treat each sentence or segment of the text as an instance. For each instance, we first use pre-trained word embeddings to initialize our neural network models, then fine-tune them on our domain corpus to capture domain-specific semantics and knowledge. Because instances are not equally important to each attribute extractor, we also use an attention mechanism to select the most important instances for each extractor. Finally, our model performs multi-task learning by solving multiple multi-class classification problems simultaneously. Experimental results show the effectiveness of our method.
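
The abstract does not give the architecture in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each instance (sentence or segment) has already been encoded as a fixed-size vector, that each attribute extractor has its own dot-product attention query over the shared instances, and that the training objective is the sum of per-attribute cross-entropy losses. The attribute names, dimensions, and random parameters are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(instances, query):
    """Pool instance vectors with weights softmax(instance . query).

    Assumption: simple dot-product attention; the abstract does not say
    which attention form is used.
    """
    weights = softmax([dot(x, query) for x in instances])
    dim = len(instances[0])
    pooled = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return pooled, weights


def classify(pooled, class_weights):
    """Multi-class softmax over the candidate values of one attribute."""
    return softmax([dot(pooled, w) for w in class_weights])


def multitask_loss(instances, tasks, labels):
    """Sum of per-attribute cross-entropy losses over the shared instances."""
    loss = 0.0
    for name, params in tasks.items():
        pooled, _ = attend(instances, params["query"])
        probs = classify(pooled, params["classes"])
        loss += -math.log(probs[labels[name]])
    return loss


def random_vector(dim):
    return [random.gauss(0.0, 1.0) for _ in range(dim)]


random.seed(0)
dim = 4

# Three encoded instances from one record (stand-ins for sentence vectors).
instances = [random_vector(dim) for _ in range(3)]

# Two hypothetical attributes with 3 and 2 candidate values respectively.
tasks = {
    "attribute_a": {"query": random_vector(dim),
                    "classes": [random_vector(dim) for _ in range(3)]},
    "attribute_b": {"query": random_vector(dim),
                    "classes": [random_vector(dim) for _ in range(2)]},
}
labels = {"attribute_a": 2, "attribute_b": 0}

loss = multitask_loss(instances, tasks, labels)
print("joint loss:", loss)
```

In this sketch the tasks share only the instance vectors. In a trained model the shared part would be the encoder and embeddings that produce those vectors, which is where joint learning across attributes would have its effect.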