A finite mixture model approach to regression under covariate misclassification

2016 
This paper considers the problem of mismeasured categorical covariates in the context of regression modeling; if unaccounted for, such misclassification is known to result in misestimation of model parameters. Here, we exploit the fact that explicitly modeling covariate misclassification leads to a mixture representation. Assuming common parametric families for the mixture components, and assuming that the misclassification occurrence is independent of the response variable, the mixture representation permits model parameters to be identified even when misclassification probabilities are unknown. Previous approaches to covariate misclassification use multiple surrogate covariates and/or validation data on the magnitude of errors. Based on this mixture structure, we demonstrate that valid inference can be performed on all the parameters even when no such additional information is available. Using Bayesian inference, the method allows for learning from data combined with external information on the magnitude of errors when such information does become available. The method is applied to adjust for misclassification on self-reported cocaine use in the Longitudinal Studies of HIV-Associated Lung Infections and Complications (Lung HIV). We find a substantial and statistically significant effect of cocaine use on pulmonary complications measured by the relative area of emphysema, whereas a regression that does not adjust for misclassification yields a much smaller estimate.
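The mixture representation mentioned above can be sketched for a single binary covariate. The notation here is an assumption made for illustration and is not taken from the paper: X is the true covariate, X* its observed (possibly misclassified) version, f(y; θ_x) the response density given X = x, and p_0, p_1 the reclassification probabilities. The second equality uses the stated assumption that misclassification is independent of the response given the true covariate.

```latex
% Illustrative notation (assumed, not from the paper):
%   X    true binary covariate,  X^*  observed surrogate
%   f(y;\theta_x)  response density given X = x
%   p_0 = P(X = 1 \mid X^* = 0),  p_1 = P(X = 0 \mid X^* = 1)
\begin{align*}
f(y \mid X^* = x^*)
  &= \sum_{x \in \{0,1\}} P(X = x \mid X^* = x^*)\, f(y \mid X = x, X^* = x^*) \\
  &= \sum_{x \in \{0,1\}} P(X = x \mid X^* = x^*)\, f(y;\theta_x)
     \qquad \text{(since } Y \perp X^* \mid X\text{)} \\[4pt]
f(y \mid X^* = 0) &= (1 - p_0)\, f(y;\theta_0) + p_0\, f(y;\theta_1) \\
f(y \mid X^* = 1) &= p_1\, f(y;\theta_0) + (1 - p_1)\, f(y;\theta_1)
\end{align*}
```

Under this sketch, each observed group follows a two-component mixture with the same component densities but different weights. This is the structure that, given parametric families for the components, allows θ_0, θ_1 and the misclassification probabilities to be learned from the data.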