• What America spends on food and drink

    May 13, 2010  |  Statistical Visualization

    How much more (or less) money do you spend on groceries than you do on dining out? How does it compare to how others spend? Bundle, a new online destination that aims to describe how we spend money, takes a look at the grocery-dining out breakdown in major cities. The average household in Austin spends the most money on food per year, period. Atlanta has the highest skew towards spending on dining out at 57%. The US average is 37%.
    Continue Reading

  • Driving habits and gas prices shift into reverse

    May 11, 2010  |  Statistical Visualization

    Hannah Fairfield of the New York Times looks at driving habits and gas prices over the past six decades. Miles driven per capita is on the horizontal, and the adjusted price of gasoline is on the vertical. The drawn path indicates order in time.

    Almost every year, Americans have driven more miles than the year before, but there's been a swing as of late. High unemployment has meant fewer people driving to work, and less consumer spending means less freight moving across the country. As a result, the path appears to swing in the opposite direction.

    [Thanks, Craig]

  • Streamgraph code ported to JavaScript

    May 7, 2010  |  Software, Statistical Visualization

    Lee Byron open-sourced his streamgraph code in Processing about a month ago. Jason Sundram has taken that and ported it to JavaScript, using Processing.js.

    The algorithms are the same as those in the original, but of course the natural benefit is that people don't need Java to run it in their browsers. Jason has also added a few features including dynamic sizing, more straightforward settings, and some interaction with zoom and hover control. Really nice work.

    Grab the code, plus examples on GitHub.

    [Thanks, Jason]

  • Tax brackets over the past century

    April 27, 2010  |  Statistical Visualization

    Stephen Von Worley's Weather Sealed is one of my new favorites. In his most recent graphic, income tax brackets for individuals are displayed.

    The colors indicate the marginal tax rate: black for low, red in the middle, and yellow for high. The horizontal axis is the tax year, and the vertical represents taxable income, log-scale, normalized to 2010 dollars with the Bureau Of Labor Statistics’ monthly CPI-U figures. The bracket data comes from The Tax Foundation and the IRS, and the effects of Social Security, capital gains, AMT, and other tax varieties are not included.
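
    For the curious, the normalization behind that vertical axis can be sketched roughly as below. The index values are round placeholders, not actual CPI-U figures, and the function names are made up for illustration. This is not Stephen's code.

```python
import math

# Placeholder price-index values, NOT real CPI-U figures from the BLS.
PLACEHOLDER_CPI = {1950: 25.0, 2010: 220.0}

def to_2010_dollars(amount, year, cpi=PLACEHOLDER_CPI):
    """Scale a nominal dollar amount by the ratio of the 2010 index to the `year` index."""
    return amount * cpi[2010] / cpi[year]

def log_axis_position(dollars):
    """Position on a base-10 log axis: each unit is a tenfold jump in income."""
    return math.log10(dollars)

adjusted = to_2010_dollars(100, 1950)
print(adjusted, log_axis_position(adjusted))
```

    Swap in the real index series to get real numbers. The log scale is what lets a $20,000 bracket and a $10 million bracket share one chart.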

    Through most of the century, brackets were much closer to a continuous scale. There was a big shift in thought though in the 1980s, when Ronald Reagan was elected president. The brackets became much more distinct. The idea has more or less stuck over the past two decades.

    Of course what sticks out the most is the 90% income tax during the mid-1900s. Earn $10 million. Give the man $9 million of it. That seems sort of, uh, wrong. The range between lowest and highest is also really big at 70 percentage points. It's only a small difference of 15 percentage points nowadays. Much better.

    Update: As noted in the comments, my knowledge of tax brackets is amazing. I should be a CPA. Here's the corrected math. Only the amount you earned over $10 million in 1950 would get taxed at 90%. So if you earned $11 million, $900,000 of the last million would go to the man. Similarly, the first $20,000 would be taxed at 20%, then the next lump at 30%, and so on and so forth. Thanks, all.
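
    To make the corrected math concrete, here's a minimal sketch of how marginal brackets work. The bracket schedule is hypothetical, pieced together from the example figures in this post (20% on the first $20,000, one 30% "next lump," 90% over $10 million). It is not the actual 1950 tax table, and `marginal_tax` is just an illustrative name.

```python
# Hypothetical schedule built from the post's example figures;
# NOT the actual 1950 tax tables.
# Each entry is (lower bound in dollars, marginal rate in whole percent).
EXAMPLE_BRACKETS = [
    (0, 20),
    (20_000, 30),       # "the next lump", collapsed into a single bracket
    (10_000_000, 90),
]

def marginal_tax(income, brackets):
    """Tax owed when each rate applies only to the income inside its bracket."""
    dollar_percents = 0  # integer arithmetic avoids float rounding
    for i, (lower, rate_pct) in enumerate(brackets):
        if income <= lower:
            break
        # The top bracket has no upper bound, so cap it at the income itself.
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else income
        dollar_percents += (min(income, upper) - lower) * rate_pct
    return dollar_percents / 100

# Tax on the eleventh million alone, under the hypothetical schedule.
last_million = (marginal_tax(11_000_000, EXAMPLE_BRACKETS)
                - marginal_tax(10_000_000, EXAMPLE_BRACKETS))
print(last_million)
```

    Under these made-up brackets, someone earning exactly $10 million pays nothing at the 90% rate. Only the dollars above that line do.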

  • March Madness Bracketology

    March 30, 2010  |  Statistical Visualization

    The Final Four is just about here. Who's going to win it all? It's anyone's guess at this point, but what we can do while we wait is examine who's won in the past. Leonardo Aranda takes a gander at who has won in each round since 1985, by ranking, with a color-coded bracket that resembles a stacked area chart.

    I think if he used just two colors per corner (instead of entire palettes) and brightness indicating rank, it might be a bit easier to read in the first rounds. At the very least, you could find the Cinderella stories quicker, which is the most exciting part of the tournament a lot of the time.

    I still like the concept though. It reminds me of Stephen's crayon colors.

    See the full-sized version here.

    Who's your money on?

    [Thanks, Leonardo]

  • Statistical Atlas from the ninth Census in 1870

    March 16, 2010  |  Statistical Visualization

    In 1870, Francis Walker oversaw publication of the United States' first Statistical Atlas, based on data from the ninth Census. It was a big moment for statistics in the United States as the atlas provided a way to compare data on a national level using maps and statistical graphics.

    What continues to amaze me about these old illustrations is the detail - all done by hand. That's ridiculous. The kicker is that a lot of this stuff looks way better than a lot of what we see nowadays. Here are some selections from the 1870 atlas.
    Continue Reading

  • Canada: the country that pees together stays together

    March 9, 2010  |  Statistical Visualization

    EPCOR, the water utility company that runs the fountains up in Edmonton, Canada, released this graph yesterday. It shows water consumption during the Olympic gold medal hockey game, overlaid on consumption from the previous day. How much do Canadians love their hockey? A lot.

    The first period ends. Time to pee. The second period ends. Time to pee. The third period ends. Time to pee. Consumption goes way down when Canada wins and during the medal ceremony.

    Finally, when it's all said and done, the rest of the country can relieve itself, figuratively and literally.

    [via contrarian | thanks, @statpumpkin]

  • Challenge: make this graph easier to read

    February 25, 2010  |  Discussion, Statistical Visualization

    The Economist discusses the return of big government and includes this graph showing total government spending as a percentage of Gross Domestic Product. We see a dip in 2000 and a big jump this past year.

    The trouble is that the country labels are cluttered. If you read them left to right, you get mixed up initially. Keep your eyes left and move top to bottom, and you might be okay.

    The Challenge

    Can you think of a way to make this graph easier to read? Is there a better way to represent the time series?

    One catch: you have to work within the size limitation of 290 pixels wide and 300 pixels tall. It's an easy fix with unlimited space. But what can you do when space is scarce? Leave your thoughts in the comments below.

    P.S. I was looking for the data this graph uses but got tired of using the OECD stat browser, so we'll just have to use our imagination for this one.

    [Thanks, Justin]

    Update: Here's GDP (sans spending) by country from 1995 to 2008 if anyone would like to take a whack [thanks, Kim].

  • An Exploration of Biological Records

    February 25, 2010  |  Statistical Visualization

    The Natural Science Museum of Barcelona has a growing database of 50,000 records of specimens collected over the past 150 years. Bestiario explores this data in their biodiversity treemap and geographical map.
    Continue Reading

  • Road to Recovery – Is the Recovery Act working?

    February 17, 2010  |  Statistical Visualization

    The Obama administration just posted a graph showing monthly job loss from December 2007 (Bush in red) up to last month. Discuss.

    [via @nickbilton]

    Update: There's a video version now [via infosthetics].
    Continue Reading

  • Build Online Visualization for Free with Tableau Public

    Tableau Software, popular for making data more accessible, mainly in the business sector, just opened up with Tableau Public. The application is similar in spirit to other online data applications like Many Eyes and Swivel. It lets you share data and visualizations online. However, Tableau Public doesn't have a central portal or a place to browse data. Rather it's focused on letting you explore data and stitch modules together on your desktop and then embed your findings on a website or blog.
    Continue Reading

  • Obama’s Budget Proposal and Incorrect Forecasts

    February 1, 2010  |  Statistical Visualization

    President Obama announced his 2011 budget proposal. How does it compare to last year's budget? Shan Carter and Amanda Cox of The New York Times compare the two plans. Red indicates a decrease in the percentage of the budget dedicated to the respective area, and green is for growth. Zoom in for a better view of the smaller areas.
    Continue Reading

  • Build Statistical Graphics Online With ggplot2

    Statisticians are generally behind the times when it comes to online applications. There are a lot of outdated Java applets and really rough attempts at getting R, a statistical computing environment, into some useful form through a browser. So imagine my surprise when I tried this tool by Jeroen Ooms, a visiting scholar at UCLA Statistics.

    It actually works pretty well, and for a prototype, it isn't half bad.
    Continue Reading

  • How to Make an Interactive Area Graph with Flare

    December 9, 2009  |  Statistical Visualization, Tutorials

    You've seen the NameExplorer from the Baby Name Wizard by Martin Wattenberg. It's an interactive area chart that lets you explore the popularity of names over time. Search by clicking on names or typing in a name in the prompt. It's simple. It's sexy. Everybody loves it.

    This is a step-by-step guide on how to make a similar visualization in ActionScript/Flash with your own data and how to customize the design for whatever you need. We're after last week's graphic on consumer spending (above).
    Continue Reading

  • Stat Charts Get a New York Times Redesign

    December 3, 2009  |  Statistical Visualization

    Statistical graphics are often... kind of bland. But that's fine, because they're usually for analysis, and the wireframe does just fine. The time eventually comes though when you need to present your analytical visualization in a paper or some slides, and you're no longer the primary reader.

    In their NYT op-ed on health care calculations, Andrew Gelman, Nate Silver, and Daniel Lee had some graphics of their own that needed some NYT flavor and design treatment.
    Continue Reading

  • The Cost of Getting Sick

    November 23, 2009  |  Statistical Visualization

    GE and Ben Fry (now the director of SEED visualization) show the cost of getting sick, from the individual's and insurer's perspective. The data: 500k records from the Medical Expenditure Panel Survey from GE's proprietary database. The visualization: a polar area pie chart.
    Continue Reading

  • Buzzwords in Academic Papers (Comic)

    November 20, 2009  |  Statistical Visualization

    This comic was really amusing, although it might be because I'm a big nerd entertained by all things from PHD Comics...

    It's my blog, and I can laugh if I want to.

    Have a nice weekend, everyone.

    [Thanks, Stephen]

  • Unemployment Rate For People Like You – NYT Interactive

    November 9, 2009  |  Statistical Visualization

    Shan Carter, Amanda Cox, and Kevin Quealy of The New York Times explore 12-month average unemployment rates for just about any breakdown you can imagine. The main point: not everyone has been affected by the recession equally, and here's how each group has felt it.

    Start with the filters up top for race, gender, age, and education level. The corresponding time series highlights blue.

    Change the filters - and here's where the graphic gets a lot of mileage - the lit line moves up or down and the vertical axis updates, depending on what you were originally looking at. That up and down movement makes comparison between demographic groups much easier, especially because there are so many time series on a single plot.

    I'm impressed, NYT. Again.

  • Facebook Measures Happiness in Status Updates

    October 5, 2009  |  Statistical Visualization

    As we all know, Facebook lets people update their friends with status updates, and with millions of users, that's a lot of data. Look at the aggregated data over time, and you could see some interesting trends.

    The Facebook Data Team recently measured happiness in the United States based on these updates with a metric they call United States Gross National Happiness.
    Continue Reading

  • 3 In-depth Views of Flight Delays and Cancellations

    September 10, 2009  |  Statistical Visualization

    Have you ever rushed to the airport only to find that your flight was delayed or canceled?

    In the most recent Data Expo at the annual Joint Statistical Meetings, data heads explored 120 million departures and arrivals in the United States, with the goal of finding "important features" such as:

    • When is the best time of day/day of week/time of year to fly to minimise delays?
    • Do older planes suffer more delays?
    • How does the number of people flying between different locations change over time?
    • How well does weather predict plane delays?

    While there were several interesting entries, here are the first, second, and third place winners.
    Continue Reading

Unless otherwise noted, graphics and words by me are licensed under Creative Commons BY-NC. Contact original authors for everything else.