Multiple imputation in a longitudinal context: A simulation study using the TREE data

2012 
Context. Missing data occur in almost all surveys and clearly limit the proper analysis of such studies. This is especially true of longitudinal surveys, where the aim is to follow the trajectories of individual subjects through time. It is now standard practice to replace missing data with imputed values, but there is still much uncertainty about the best way to do so. The so-called chained equations method is one of the most flexible of these approaches. Our objective was to evaluate the usefulness of this particular method on real data.

Method. Starting from a Swiss longitudinal survey, the Transitions from Education to Employment (TREE) survey, we selected a subsample without any missing data and then randomly generated either 10% or 20% missing data on selected variables. Different strategies using multiple imputation and the chained equations method were then tested for replacing the missing data with plausible values. Results were evaluated at the aggregate level by comparing the probability distributions computed on the original data and on the imputed data. The impact of imputation on causality between waves was also assessed. A second set of experiments considered the missing data actually present in the original TREE data.

Results. The chained equations method is very well suited to the imputation of missing data in a longitudinal context. Results are very stable from one simulation to another, and no systematic bias appeared. The critical point of the method lies in the proper choice of the covariates used during the imputation process. In particular, it is essential to strictly respect the temporal order of the different waves of the survey and not to use covariates from future waves to impute the present. Otherwise, the causal link between waves will be clearly affected by the imputation process.
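The simulation design described under Method can be illustrated with a minimal sketch. It is not the authors' code and does not use the TREE data: the two-wave synthetic data, the function names, and the 0.7 regression slope are all assumptions made for illustration. With a single incomplete variable, chained equations reduce to one conditional model, so the sketch uses stochastic regression imputation as a simplified stand-in. It also omits the posterior draw of the regression parameters that a full multiple imputation procedure would include. The point it shows is the temporal rule: wave 2 is imputed from wave 1 only.

```python
import random
import statistics

def make_complete_data(n, rng):
    """Synthetic two-wave data (hypothetical, not TREE): wave 2 depends on wave 1."""
    rows = []
    for _ in range(n):
        w1 = rng.gauss(0.0, 1.0)
        w2 = 0.7 * w1 + rng.gauss(0.0, 0.5)
        rows.append([w1, w2])
    return rows

def ampute_mcar(rows, col, rate, rng):
    """Copy the data and set a fraction `rate` of column `col` to None, completely at random."""
    out = [list(r) for r in rows]
    k = round(rate * len(out))
    for i in rng.sample(range(len(out)), k):
        out[i][col] = None
    return out

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(e * e for e in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd

def impute_once(rows, rng):
    """Impute wave 2 from wave 1 only (past predicts present), adding residual noise."""
    obs = [r for r in rows if r[1] is not None]
    a, b, sd = fit_line([r[0] for r in obs], [r[1] for r in obs])
    return [
        [r[0], r[1] if r[1] is not None else a + b * r[0] + rng.gauss(0.0, sd)]
        for r in rows
    ]

def multiple_imputation(rows, m, rng):
    """Produce m completed data sets, each with its own random draws."""
    return [impute_once(rows, rng) for _ in range(m)]

rng = random.Random(2012)
complete = make_complete_data(500, rng)
incomplete = ampute_mcar(complete, col=1, rate=0.20, rng=rng)
imputed_sets = multiple_imputation(incomplete, m=5, rng=rng)

# Aggregate-level check: compare the wave 2 mean on the original data
# with the mean pooled over the m imputed data sets.
true_mean = statistics.fmean(r[1] for r in complete)
pooled_mean = statistics.fmean(
    statistics.fmean(r[1] for r in d) for d in imputed_sets
)
n_missing = sum(r[1] is None for r in incomplete)
print(n_missing, round(true_mean, 3), round(pooled_mean, 3))
```

Because the complete data are known, the pooled estimate can be compared directly with the value computed before amputation, which mirrors the evaluation strategy of the study. A real analysis would more likely rely on an established chained equations implementation than on hand-written regression.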