Same summary statistics, completely different plots

Posted to Statistics  |  Tags: , ,  |  Nathan Yau

Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to assume things like distribution and dependencies. Visualization helps you see what else there is.

Justin Matejka and George Fitzmaurice demonstrate in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.

Favorites

Life expectancy changes

The data goes back to 1960 and up to the most current estimates for 2009. Each line represents a country.

Top Brewery Road Trip, Routed Algorithmically

There are a lot of great craft breweries in the United States, but there is only so much time. This is the computed best way to get to the top rated breweries and how to maximize the beer tasting experience. Every journey begins with a single sip.

Real Chart Rules to Follow

There are rules—usually for specific chart types meant to be read in a specific way—that you shouldn’t break. When they are, everyone loses. This is that small handful.

A Day in the Life of Americans

I wanted to see how daily patterns emerge at the individual level and how a person’s entire day plays out. So I simulated 1,000 of them.