  • Mozilla Labs just released a bunch of anonymized browsing data for their open data visualization competition:

    This competition is based on Mozilla’s own open data program, Test Pilot. Test Pilot is a user research platform that collects structured user data through Firefox. All data is gathered through pre-defined Test Pilot studies, which aim to explore how people use their web browser and the Internet.

    There are two datasets in various formats. The first is browsing behavior from 27,000 users, including on/off private browsing that we saw a few months ago. The second dataset is from 160,000 users and is on how they actually use the Firefox interface.

    Additionally, both sets have survey answers to questions like “How long have you used Firefox?” which could make for some fun and interesting breakdowns.

    The deadline is December 17.

    [Mozilla Labs]

  • How did you get to where you are now in your work life? What about Barack Obama? Ashton Kutcher? Jon Stewart? In a collaboration between Newsweek and Bocoup, the Career Tree displays your LinkedIn profile (or those of a handful of celebrities) as a budding network.

  • John Allen Paulos, professor of mathematics at Temple University, describes the differences between statistics and stories:

    [T]here is a tension between stories and statistics, and one under-appreciated contrast between them is simply the mindset with which we approach them. In listening to stories we tend to suspend disbelief in order to be entertained, whereas in evaluating statistics we generally have an opposite inclination to suspend belief in order not to be beguiled.

    And he concludes:

    The focus of stories is on individual people rather than averages, on motives rather than movements, on point of view rather than the view from nowhere, context rather than raw data. Moreover, stories are open-ended and metaphorical rather than determinate and literal.

    Which way do we go when we start telling stories with data?

    [New York Times via @joandimicco]

  • Happy Thanksgiving! Eat lots and lots and lots. Rest. Then eat more.…

  • Something for you leading into the Thanksgiving weekend. Enjoy this short video (below) by Amy Thornley, which she made for smile for london. It was animated frame-by-frame in Excel. I look forward to when someone makes this an actual game. That way you could play it at the office, and it would look like you’re working when you pause it.

  • How to Make Bubble Charts

    Ever since Hans Rosling presented a motion chart to tell his story of the wealth and health of nations, there has been an affinity for proportional bubbles on x-y axes. This tutorial is for the static version of the motion chart: the bubble chart.
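
    One detail that tends to trip people up with proportional bubbles is that the value should map to a circle's area, not its radius. Below is a minimal sketch of that scaling. It is not taken from the tutorial itself; the function name bubble_radii, the max_radius parameter, and the sample values are assumptions made up for illustration.

```python
import math


def bubble_radii(values, max_radius=1.0):
    """Return one radius per value so that circle AREA is proportional to the value.

    Hypothetical helper for illustration. The largest value gets max_radius;
    negative values are drawn as zero-size bubbles.
    """
    largest = max(values)
    if largest <= 0:
        raise ValueError("need at least one positive value")
    # Area grows with radius squared, so take the square root of the ratio.
    return [max_radius * math.sqrt(max(v, 0) / largest) for v in values]


# Made-up values: 400 is four times 100, so its bubble gets twice the radius.
radii = bubble_radii([100, 400, 900], max_radius=30)  # roughly [10, 20, 30]
```

    If you scaled the radius directly instead, the 400 bubble would cover sixteen times the area of the 100 bubble and overstate the difference.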

  • In case you missed it, Girl Talk recently released his fifth album All Day, which samples from 372 songs. Essentially, it’s an album of mashups, so together, samples from multiple songs combine to make a single song. @brahn shows what samples are playing at any given time as you listen to the album. Press play, and the current samples highlight.

  • In a follow-up to their puzzle to balance the budget, The New York Times shows the top selections that about seven thousand Twitter users made. It’s not a scientific sample, as it’s only Twitter users, but interesting to look at nevertheless with a number of useful breakdowns.

  • Political science PhD candidate David Sparks has a look at the evolution of the two-party vote:

    Using county-level data, I spatially and temporally interpolated presidential vote returns for the two major party candidates in each election from 1920-2008. The result illuminates the sometimes gradual, sometimes rapid change in the geographic basis of presidential partisanship.
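
    The quote does not say how the interpolation was done, so the following is only a rough sketch of the temporal half of the idea: estimating a county's vote share for the years between elections. It assumes plain linear interpolation, which may not be what Sparks used. The function name interpolate_share and the shares are hypothetical, and the spatial interpolation is left out entirely.

```python
def interpolate_share(year, known_shares):
    """Estimate a vote share for `year` by linear interpolation.

    known_shares maps election year -> share (0.0 to 1.0) for one county.
    Assumption: shares change linearly between consecutive elections.
    """
    years = sorted(known_shares)
    if not years[0] <= year <= years[-1]:
        raise ValueError("year is outside the range of known elections")
    if year in known_shares:
        return known_shares[year]
    # Find the elections immediately before and after the requested year.
    before = max(y for y in years if y < year)
    after = min(y for y in years if y > year)
    weight = (year - before) / (after - before)
    return known_shares[before] + weight * (known_shares[after] - known_shares[before])


# Hypothetical shares for one county, not real election results.
county = {1920: 0.40, 1924: 0.50, 1928: 0.44}
midpoint = interpolate_share(1922, county)  # halfway between the 1920 and 1924 shares
```

    Running this for every in-between year gives the frames of an animation that moves smoothly from one election to the next.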


  • In work with the American Human Development Project, Rosten Woo and Zachary Watson map the Human Development Index, along with many other indicators, in this thorough interactive.

  • The scales for what qualifies as pretty and useful change depending on the application and purpose, but you always aim for the same quadrant. The best data graphics come from those who are able to find the right balance between aesthetics and utility.

  • We’ve seen this sort of thing before, with tweets mapped and such, but the recent A World of Tweets by Frog Design is nicely executed (in HTML5).

    A World of Tweets is all about playing with geography and bits of information. Simply put, A World of Tweets shows you where people are tweeting at from the past hour. The more tweets there are from a specific region, the “hotter” or redder it becomes.

    You can toggle between a few different views such as smokey or heatmap, or outline or satellite view, but the highlight has gotta be the 3d view. Unfortunately, I don’t have any red and blue paper lens glasses on me. Dang it.

  • TenderMaps brings an informal approach to highlighting the parts of neighborhoods:

    We wanted to move from the static and singular, toward more dynamic, subtle definitions of neighborhoods, definitions emphasizing the nuanced communities and personal experiences that really shape a neighborhood’s boundaries. We wondered how we could harness the implicit mental maps people actually use. What would happen if we defined a neighborhood by the way we moved through it, or by the places we loved in it?

    In this first iteration, the creators walked around the Tenderloin in San Francisco, asked residents questions about their neighborhood, and had them sketch on a paper map. The sketches were scanned to make a browsable map, including backstories of each scribble.

    The proof of concept is still rough, and uber slow in Chrome, but you should be able to see how useful this might be. It’s much more personal than the markers we are used to seeing and could be a way for non-tech people to see their community in a tech way.

    [TenderMaps via @zainy]

  • In this interactive, USA Today guesses your age, based on what influenced you as a teenager:

    The year you were born partly determines what generation you belong to, but so do your cultural experiences. The chart below shows the offset from birth years to one’s teenage years — when people are most influenced by the world around them — and the music, movies, TV, news, fashion, technology, toys and sports of those eras.

    Simple and entertaining. I took the quiz twice, and the guess was different each time. It was one year off both times, so dead on when you take the average. What generation do you belong to?

    [USA Today]

  • This Forbes post on the greatness that is R is being passed around by every statistician and his mother today.

    It’s not that this type of analysis wasn’t possible before — statisticians have existed, and commercial software has been available to support them, for decades. The fact that R is free to use, free to modify, and its source is open to view, extend and improve means students, stock traders-in-training and fantasy football junkies can familiarize themselves with the software. They can write programs against it. They’re likely to continue that usage into their professional lives. When they share their work, the community, down the line, benefits. And the virtuous cycle strengthens.

    What’s your favorite (graphical) use of R?

  • AT&T Labs’ Infoviz research group describes network graphs and their many uses:

    There is information in the connections. A glance is enough to identify nodes with the most links, nodes straddling different subgroups, and nodes isolated by their lack of connections. Corporations might look at a graph to verify that marketing and sales are communicating, urban planners to monitor the interconnectedness, or isolation, of neighborhoods, biologists to discover interactions between genes, and network analysts to monitor security.

    And on aesthetics:

    Aesthetics is important not so much for looks—though some visualizations can be stunning to look at—but for readability. Links that intersect and nodes that overlay one another result in poor readability, and graph visualization programs work hard to minimize the number of link intersections and give enough whitespace around each node to make it stand out from its neighbors.

    [AT&T Labs via TomC]

  • Wikipedia’s annual fundraiser is in progress. If you haven’t noticed already, when you go to the site, there’s a banner at the top that asks for donations. A few weeks ago, Wikipedia tested four different banners (below) to see which one resulted in the most donations, and they just posted the data for the test (along with some others). Can you visualize this?

  • When we first learn how to deal with data in school, it’s nicely formatted and fits perfectly into a rectangular spreadsheet. Then when we start to deal with real data, we find missing values, inconsistencies, and for some reason it doesn’t plug straight into our software. What the heck?

    The caveman way to fix this problem is to open Excel and manually edit everything. Some ad hoc code can often fix your problems, but that still takes time and can be a pain. Google Refine, the Googley evolution of Freebase Gridworks, can help you.

  • Matthew Ericson, deputy graphics director of The New York Times, dug through the archives to find the first occurrence of an election map in the paper, in 1896:

    The speed with which the results made it into print boggles the mind given the technology of the day (especially considering that in the last few elections in the 2000s, with all of the technology available to us, there have been a number of states that we haven’t been able to call in the Wednesday paper).

    What a beaut. That day, the paper cost 3 cents.

  • If you take away anything from The Visual Display of Quantitative Information, make it the epilogue. This is the most important part:

    Design is choice. The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. Most principles of design should be greeted with some skepticism, for word authority can dominate our vision, and we may come to see only through the lenses of word authority rather than with our own eyes.

    When we first start out with data graphics, it is easy to read a list of rules about ratios, flourishes, and sizes, and then trick ourselves into believing that is all there is to it. But like cooking, writing, programming, painting, speaking, designing, sporting and numerous other things, you learn the basics first. The principles. And then you figure out what rules can bend and how far.