Analysis of Longitudinal Data with Unmeasured Confounders

1991 
SUMMARY Confounding in longitudinal or clustered data creates special problems and opportunities because the relationship between the confounder and covariate of interest may differ across and within individuals or clusters. A well-known example of such confounding in longitudinal data is the presence of cohort and period effects in models of aging in epidemiologic research. We first formulate a data-generating model with confounding and derive the distribution of the response variable unconditional on the confounder. We then examine the properties of the regression coefficient for some analytic approaches when the confounder is omitted from the fitted model. The expected value of the regression coefficient differs in across- and within-individual regression. In the multivariate case, within- and between-individual information is combined and weighted according to the assumed covariance structure. We assume compound symmetry in the fitted covariance matrix and derive the variance, bias, and mean squared error of the slope estimate as a function of the fitted within-individual correlation. We find that even in this simplest multivariate case, the trade-off between bias and variance depends on a large number of parameters. It is generally preferable to fit correlations somewhat above the true correlation to minimize the effect of between-individual confounders or cohort effects. Period effects can lead to situations where it is advantageous to fit correlations that are below the true correlation. The results highlight the trade-offs inherent in the choice of method for analysis of longitudinal data, and show that an appropriate choice can be made only after determining whether within- or between-individual confounding is the major concern.

The investigation and control of confounding is a major emphasis in the analysis of observational studies. We adapt the definition of confounding of Kleinbaum, Kupper, and Morgenstern (1982, p. 244).
A confounder is then defined as a factor, the control of which changes the relationship between the primary factor under study and the outcome. In a multiple regression situation this would occur when the potential confounder (z) is correlated with both the outcome (y) and the factor under investigation (x). We are concerned with the special case of regression model misspecification that occurs when a confounder is unmeasured or otherwise omitted from the model.
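The effect of omitting such a confounder can be illustrated with a small simulation. The sketch below is not the paper's data-generating model. It assumes a hypothetical subject-level confounder z (a cohort-type effect) that is correlated with each subject's mean of x but not with the within-subject deviations of x. The function name `simulate_slopes` and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_slopes(n_subjects=2000, n_times=5, beta=1.0, gamma=2.0, seed=0):
    """Return (between, within, pooled) slope estimates of y on x with z omitted."""
    rng = np.random.default_rng(seed)

    # Hypothetical subject-level confounder (assumption, e.g. a cohort effect).
    z = rng.normal(size=n_subjects)

    # The subject mean of x depends on z; within-subject deviations do not.
    x_mean = 0.8 * z + rng.normal(scale=0.6, size=n_subjects)
    x = x_mean[:, None] + rng.normal(size=(n_subjects, n_times))

    # True model: y depends on x (slope beta) and on the confounder z.
    y = beta * x + gamma * z[:, None] + rng.normal(size=(n_subjects, n_times))

    def ols_slope(u, v):
        # Simple-regression slope of v on u, with z left out of the fit.
        u = u - u.mean()
        v = v - v.mean()
        return float((u * v).sum() / (u * u).sum())

    x_bar = x.mean(axis=1)
    y_bar = y.mean(axis=1)

    # Across-individual regression: one point per subject (the subject means).
    between = ols_slope(x_bar, y_bar)
    # Within-individual regression: deviations from each subject's own mean.
    within = ols_slope((x - x_bar[:, None]).ravel(), (y - y_bar[:, None]).ravel())
    # Pooled regression that ignores the clustering altogether.
    pooled = ols_slope(x.ravel(), y.ravel())
    return between, within, pooled


if __name__ == "__main__":
    between, within, pooled = simulate_slopes()
    print(f"between={between:.2f}  within={within:.2f}  pooled={pooled:.2f}")
```

Under these assumptions the within-individual slope stays close to the true beta, because centering on each subject's mean removes z. The across-individual slope absorbs part of the effect of z, and the pooled slope falls between the two. A period-type confounder that varies within individuals would reverse which estimate is protected.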