Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record

2018 
The wide implementation of electronic health record (EHR) systems facilitates the collection of large-scale health data from real clinical settings. Despite the significant increase in the adoption of EHR systems, these data remain largely unexplored, yet they present a rich source for knowledge discovery from patient health histories in tasks such as understanding disease correlations and predicting health outcomes. However, the heterogeneity, sparsity, noise, and bias in these data present many complex challenges, which make it difficult to translate potentially relevant information into machine learning algorithms. In this paper, we propose a computational framework, Patient2Vec, to learn an interpretable deep representation of longitudinal EHR data that is personalized for each patient. To evaluate this approach, we apply it to the prediction of future hospitalizations using real EHR data and compare its predictive performance with baseline methods. Patient2Vec produces a vector space with meaningful structure and achieves an area under the curve (AUC) of around 0.799, outperforming baseline methods. Finally, the learned feature importance can be visualized and interpreted at both the individual and population levels to bring clinical insights.