Same summary statistics, completely different plots

Posted to Statistics  |  Nathan Yau

Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to assume things about the data, like the shape of its distribution and the dependencies between variables. Visualization helps you see what else there is.

Justin Matejka and George Fitzmaurice demonstrate this in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.”
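
To get a feel for the idea, here is a minimal sketch in Python. It is not the authors' method. It only rescales two made-up datasets (the names `single_peak` and `two_clumps` and their values are invented for illustration) so that they share the same mean and standard deviation, even though one clusters around a single peak and the other splits into two clumps.

```python
import statistics


def match_mean_sd(values, target_mean, target_sd):
    """Shift and scale values so their mean and population SD hit the targets."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd * target_sd + target_mean for v in values]


# Hypothetical example data: two very different shapes.
single_peak = [1, 4, 5, 5, 5, 5, 6, 9]     # most values near the middle
two_clumps = [0, 0, 1, 1, 9, 9, 10, 10]    # nothing near the middle

a = match_mean_sd(single_peak, target_mean=50, target_sd=10)
b = match_mean_sd(two_clumps, target_mean=50, target_sd=10)

for name, data in (("single peak", a), ("two clumps", b)):
    # The summary line matches for both datasets; the sorted values do not.
    print(name, round(statistics.fmean(data), 2), round(statistics.pstdev(data), 2))
    print("  ", [round(v, 1) for v in sorted(data)])
```

The mean and standard deviation come out the same for both, but the sorted values show two different pictures, which is the kind of difference a plot makes obvious and a summary table hides.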
