Same summary statistics, completely different plots

Posted to Statistics  |  Nathan Yau

Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to make assumptions about things like the distribution of the data and the dependencies between variables. Visualization helps you see what else there is.

Justin Matejka and George Fitzmaurice demonstrate this in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.”
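
To get a feel for the idea, here is a minimal Python sketch. It is not the method from the paper. It only rescales two made-up datasets so that they share the same mean and standard deviation, even though one is two tight clumps and the other is evenly spread. The helper name `match_mean_sd` and the sample values are assumptions for illustration.

```python
import statistics


def match_mean_sd(values, target_mean, target_sd):
    """Linearly rescale values to a given mean and population standard deviation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [target_mean + (v - mean) * target_sd / sd for v in values]


# Two deliberately different shapes (made-up data).
clustered = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9]  # two tight clumps
spread = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]     # evenly spaced values

a = match_mean_sd(clustered, 50.0, 10.0)
b = match_mean_sd(spread, 50.0, 10.0)

for name, data in (("clustered", a), ("spread", b)):
    # The two summaries agree, but the number of distinct values
    # hints at how different the underlying shapes are.
    print(
        name,
        "mean:", round(statistics.fmean(data), 6),
        "sd:", round(statistics.pstdev(data), 6),
        "distinct values:", len(set(round(v, 6) for v in data)),
    )
```

The mean and standard deviation come out the same for both lists, yet a histogram of each would look nothing alike, which is the kind of difference only a plot makes obvious.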
