GRU-DF: A Temporal Model with Dynamic Imputation for Missing Target Values in Longitudinal Patient Data

2020 
Temporal models are well suited to studying progressive diseases because the data are typically collected at regular time intervals. However, such clinical data often contain many missing entries, including entries of the target variable that we are interested in predicting. Standard imputation techniques (e.g., linear interpolation) are inappropriate for missing target observations because they fill in the missing entries before model training begins and would therefore inevitably lead to training a self-fulfilling model. Missing target observations are particularly problematic for time series data, where a target value at each time step is indispensable for building a temporal model. We propose a novel approach that incorporates the imputation of missing target values into the training process of the Gated Recurrent Unit (GRU) model. We evaluate the new model in our motivating domain, predicting disease progression in multiple sclerosis patients, using a real-world dataset of 508 subjects. The goal is to forecast patients’ disability levels from data collected at six-month intervals. Our model demonstrates a 27.9% performance gain over a GRU model that uses standard forward-fill for the missing target observations. Additionally, our model shows a 21.6% advantage over a non-temporal approach on the same machine learning task.
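The contrast between filling targets before training and imputing them during training can be illustrated with a minimal sketch. It is not the authors' GRU-DF implementation, because the abstract does not specify the mechanism. It assumes one plausible reading: when the previous target is missing, the model's own previous prediction is fed forward in its place, and the loss is computed only on observed targets. The names `forward_fill`, `rollout_with_dynamic_imputation`, `masked_mse`, and the `step` callable (a stand-in for a trained recurrent cell) are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def forward_fill(targets: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Baseline treatment: carry the last observed target forward.

    Missing entries are represented as None. Leading missing entries stay
    None because there is nothing to carry forward yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for y in targets:
        if y is not None:
            last = y
        filled.append(last)
    return filled


def rollout_with_dynamic_imputation(
    features: Sequence[float],
    targets: Sequence[Optional[float]],
    step: Callable[[float, float], float],
    initial_target: float = 0.0,  # assumed starting value, not from the paper
) -> List[float]:
    """Run a one-step predictor over a sequence (assumed scheme).

    `step(x_t, prev_y)` returns the prediction for time t. If the target at
    time t is observed, it becomes the next `prev_y`; otherwise the model's
    own prediction is used, so imputation happens inside the rollout
    instead of before training.
    """
    preds: List[float] = []
    prev_y = initial_target
    for x_t, y_t in zip(features, targets):
        y_hat = step(x_t, prev_y)
        preds.append(y_hat)
        prev_y = y_t if y_t is not None else y_hat
    return preds


def masked_mse(preds: Sequence[float], targets: Sequence[Optional[float]]) -> float:
    """Mean squared error over observed targets only.

    Imputed positions contribute no loss, so the model is never rewarded
    for reproducing its own imputations.
    """
    errors = [(p - y) ** 2 for p, y in zip(preds, targets) if y is not None]
    return sum(errors) / len(errors) if errors else 0.0
```

In a real training loop, `step` would be replaced by a GRU cell with learnable parameters and `masked_mse` by a differentiable masked loss. The pure-Python version only shows the data flow under the stated assumptions.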