language-icon Old Web
English
Sign In

Simpson's paradox

Simpson's paradox (or Simpson's reversal, Yule–Simpson effect, amalgamation paradox, or reversal paradox) is a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. Simpson's paradox (or Simpson's reversal, Yule–Simpson effect, amalgamation paradox, or reversal paradox) is a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined. This result is often encountered in social-science and medical-science statistics and is particularly problematic when frequency data is unduly given causal interpretations. The paradox can be resolved when causal relations are appropriately taken care of in the statistical modeling. It has been used to try to inform the non-specialist or public audience about the kind of misleading results mis-applied statistics can generate. Martin Gardner wrote a popular account of Simpson's paradox in his March 1976 Mathematical Games column in Scientific American. Edward H. Simpson first described this phenomenon in a technical paper in 1951, but the statisticians Karl Pearson et al., in 1899, and Udny Yule, in 1903, had mentioned similar effects earlier. The name Simpson's paradox was introduced by Colin R. Blyth in 1972. One of the best-known examples of Simpson's paradox is a study of gender bias among graduate school admissions to University of California, Berkeley. The admission figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance. But when examining the individual departments, it appeared that six out of 85 departments were significantly biased against men, whereas only four were significantly biased against women. In fact, the pooled and corrected data showed a 'small but statistically significant bias in favor of women'. The data from the six largest departments are listed below, the top two departments by number of applicants for each gender italicised. The research paper by Bickel et al. concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as in the English Department), whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants (such as in engineering and chemistry). This is a real-life example from a medical study comparing the success rates of two treatments for kidney stones. The table below shows the success rates and numbers of treatments for treatments involving both small and large kidney stones, where Treatment A includes all open surgical procedures and Treatment B is percutaneous nephrolithotomy (which involves only a small puncture). The numbers in parentheses indicate the number of success cases over the total size of the group. The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on large stones, yet treatment B is more effective when considering both sizes at the same time. In this example, the 'lurking' variable (or confounding variable) is the severity of the case (represented by the doctors' treatment decision trend of favoring B for less severe cases), which was not previously known to be important until its effects were included.

[ "Statistics", "Machine learning", "Mathematical economics", "Econometrics" ]
Parent Topic
Child Topic
    No Parent Topic