Patient Entity Recognition by Automatic EHR Context Understanding and Deep Learning

2019 
Patient entity recognition, or patient entity extraction, is the task of detecting the electronic health records (EHRs) across multiple data sources that belong to the same patient and linking those records together. It is a useful technology for cross-system electronic health data analysis: it helps to define commonality, to synthesize multiple data sources, and to reduce data redundancy. In this paper, we propose a deep learning solution, a sequential LSTM + Word Embedding network model (WE + LSTM), that filters and represents unstructured electronic health records by measuring their context similarity and links them to the matching patient entities. The text context features are first passed through a trained bidirectional LSTM network, which removes the irrelevant patient entities. A trained shallow word embedding network then estimates the word vector similarity between the remaining patient information context and the existing entities in the database. Finally, the new input patient data is linked to the existing patient entity in the dataset with the greatest context similarity. Our hypothesis is that records pointing to the same patient have the closest context similarity, so these patterns can be encoded by a trained word embedding network. An infectious disease registration dataset (5304 patient entities) is used to evaluate the performance of the proposed WE + LSTM model. The classification accuracy is 0.837 and the F score is 0.843, the highest among the compared models, which include a single word embedding model, a random forest model, and a conventional neural network model. In addition, the WE + LSTM model has the greatest area under the curve (AUC) when the ROC curves of the four models are compared. These results indicate that the proposed WE + LSTM model provides a feasible solution for correctly recognizing patient identities from electronic records by measuring text context similarity.
The model thus offers a solution for patient identity recognition when integrating multi-source health data, which is an urgent task for health big data projects.
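The final linking step described in the abstract, in which a new record is attached to the existing patient entity with the greatest context similarity, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy word vectors, the averaging of word vectors into a record vector, the use of cosine similarity, and the names `TOY_VECTORS`, `embed`, `cosine`, and `link_record` are all assumptions made for illustration. The bidirectional LSTM filtering stage is omitted entirely.

```python
import math

# Hypothetical toy word vectors. In the paper these would come from a
# trained word embedding network; the values here are made up.
TOY_VECTORS = {
    "fever": [0.9, 0.1, 0.0],
    "cough": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.9, 0.2],
    "cast": [0.1, 0.8, 0.3],
    "rash": [0.2, 0.1, 0.9],
}


def embed(text, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def link_record(record_text, entities, vectors):
    """Return (entity_id, similarity) for the most similar existing entity.

    Returns (None, 0.0) when the record or every entity has no known words.
    """
    query = embed(record_text, vectors)
    if query is None:
        return None, 0.0
    best_id, best_sim = None, -1.0
    for entity_id, entity_text in entities.items():
        candidate = embed(entity_text, vectors)
        if candidate is None:
            continue
        sim = cosine(query, candidate)
        if sim > best_sim:
            best_id, best_sim = entity_id, sim
    if best_id is None:
        return None, 0.0
    return best_id, best_sim


# Existing entities, each summarized by free text (assumed format).
entities = {"patient_a": "fever cough", "patient_b": "fracture cast"}
best_id, best_sim = link_record("persistent cough and fever", entities, TOY_VECTORS)
print(best_id, best_sim)
```

In this toy setup the new record shares its known words with `patient_a`, so it is linked to that entity. A real system would likely also need a similarity threshold below which a record is treated as a new patient rather than forced onto the nearest existing one; the abstract does not say how that case is handled.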