A visual explanation of Simpson’s Paradox

Sep 19, 2013

When you look for overall trends, you often poke around the data in aggregate, but when you zoom out too far, you can miss details or within-category variation. Sometimes when you zoom in, you see a trend completely opposite to the one you saw overall. This is known as Simpson’s paradox. Lewis Lehe and Victor Powell explain it in a series of small, interactive charts.

Why does this matter?

Simpson’s paradox usually fools us on tests of performance. In a famous example, researchers concluded that a newer treatment for kidney stones was more effective than traditional surgery, but it was later revealed that the newer treatment was more often being used on small kidney stones. More recently, on elementary school tests, minority students in Texas outperform their peers in Wisconsin, but Texas has so many minority students that Wisconsin beats it in state rankings. It would be a shame if Simpson’s paradox led doctors to prescribe ineffective treatments or Texas schools to waste money copying Wisconsin.
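To make the kidney stone example concrete, here is a small sketch of the arithmetic. The success counts are the figures commonly cited for that study, not numbers given in the post, so treat them as illustrative. The labels `surgery` and `newer` and the helper names are made up for this example.

```python
# (successes, patients) per stone size. These counts are the commonly cited
# figures for the kidney stone study and are an assumption here; the post
# itself gives no numbers.
data = {
    "surgery": {"small": (81, 87), "large": (192, 263)},
    "newer": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    """Success rate for one group."""
    return successes / patients


def overall(groups):
    """Success rate after pooling all stone sizes together."""
    total_successes = sum(s for s, _ in groups.values())
    total_patients = sum(n for _, n in groups.values())
    return rate(total_successes, total_patients)


for name, groups in data.items():
    parts = ", ".join(
        f"{size}: {rate(*counts):.0%}" for size, counts in groups.items()
    )
    print(f"{name:8s} {parts}, overall: {overall(groups):.0%}")
```

With these counts, surgery has the higher success rate for small stones and for large stones, yet the newer treatment comes out ahead once the groups are pooled. The reversal comes from the mix of cases: most of the newer treatment's patients had small stones, which are easier to treat either way.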

The takeaway lesson: Remember to look at the details. [Thanks, Victor]
