On the causal interpretation of race in regressions adjusting for confounding and mediating variables

2014 
In observational research on health disparities, race/ethnicity is often included in a regression model, and the coefficient estimates are frequently interpreted as some measure of health disparity.1–3 Numerous other sociodemographic, economic, biological or psychosocial variables are typically included in these regressions. Some of these variables may lie on the pathway between race/ethnicity and the health outcome. Other variables may be strongly associated with, but seemingly in no sense “caused by,” race/ethnicity. The regression coefficient for race/ethnicity is often interpreted as a “health disparity,” regardless of which other variables have been controlled for. However, as we will argue in this paper, the interpretation of regression coefficients depends critically on issues of temporal ordering and causality.
There have been numerous discussions of approaches to defining the “causal effects of race.”4–9 Some of these focus on specific settings in which “race” itself can be defined as, say, the race perceived on a job application, which can be hypothetically manipulated. In this paper we offer a tentative proposal for the general interpretation of a race/ethnicity variable in a regression analysis and for how this interpretation might vary with the other variables that are controlled for. What we propose certainly does not capture all of the subtleties of race/ethnicity in health disparities research, but we hope it can encourage more careful thought about which variables to include in regression models that involve race.
Part of the challenge of interpreting race coefficients causally is that, in the formal causal-inference literature, effects are often defined in terms of counterfactual or potential outcomes, which are in turn defined as the outcomes that would result under hypothetical interventions.10–23 There are, however, no reasonable hypothetical interventions on race when race itself is the exposure.
Here we attempt to provide a causal interpretation of race coefficients in regressions without defining potential outcomes for race itself. When adjustment is made for socioeconomic status early in a person’s life, we will see that the race coefficient can sometimes be interpreted as the extent to which a racial inequality would remain if the distributions of various early-life socioeconomic factors could be equalized across racial groups. When adjustment is also made for adult socioeconomic status, the overall racial inequality can be decomposed into the portion that would be eliminated by equalizing adult socioeconomic status across racial groups and the portion that would remain even if adult socioeconomic status were equalized. Essentially, we give a plausible causal interpretation of the race coefficient by considering how much of a racial inequality could be eliminated by intervening on a different variable, namely socioeconomic status, which is more manipulable than race. We discuss the possibility of stronger interpretations of race coefficients in regression models, and the challenges of doing so.
The elimination of health disparities is one of the U.S. federal government’s leading health objectives.24 Persistently poorer health outcomes for some population groups may indicate violations of U.S. norms of equality of opportunity and individual dignity.25 Health disparities also limit the economic productivity and well-being of the nation.25 Understanding the causes of such disparities is central to addressing them, and we hope that the methodological approach in this paper might contribute to that end.
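The decomposition described above, in which an overall racial inequality is split into a portion eliminated by equalizing adult socioeconomic status and a portion that would remain, can be illustrated with a minimal sketch. This is not the paper's own code. It assumes linear outcome models with no interactions and no unmeasured confounding of the relationship between adult socioeconomic status and the outcome. The function names (`ols`, `decompose_disparity`), the variable names, and the synthetic data are all hypothetical and serve only to show the arithmetic.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def decompose_disparity(y, group, early_ses, adult_ses):
    """Compare the group coefficient with and without adjustment for adult SES.

    Under the assumptions stated in the lead-in (linear models, no
    interactions, no unmeasured confounding of adult SES and the outcome):
      total      = group coefficient adjusting for early-life SES only
      residual   = group coefficient after also adjusting for adult SES
      eliminated = total - residual
    """
    total = ols(y, group, early_ses)[1]
    residual = ols(y, group, early_ses, adult_ses)[1]
    return {"total": total, "residual": residual, "eliminated": total - residual}


# Synthetic data, generated purely for illustration (not real study data).
rng = np.random.default_rng(0)
n = 200_000
group = rng.integers(0, 2, n).astype(float)  # hypothetical 0/1 group indicator
early_ses = -0.5 * group + rng.normal(size=n)
adult_ses = -0.8 * group + 0.6 * early_ses + rng.normal(size=n)
y = -1.0 * group + 0.3 * early_ses + 0.5 * adult_ses + rng.normal(size=n)

result = decompose_disparity(y, group, early_ses, adult_ses)
print(result)
```

In this simulated setup the data-generating coefficients imply a residual inequality of -1.0 and an eliminated portion of 0.5 × (-0.8) = -0.4, so the estimates should land close to those values. With real data, the interpretation depends on whether the stated assumptions are plausible.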