• As a resident at Eyebeam, Alexander Chen visualizes the first Prelude from Bach’s Cello Suites:

    Using the mathematics behind string length and pitch, it came from a simple idea: what if all the notes were drawn as strings? Instead of a stream of classical notation on a page, this interactive project highlights the music’s underlying structure and subtle shifts.

    Interactive version here. Charming.

    [Alexander Chen via @blprnt]
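
    For the curious, the string-length idea can be sketched in a few lines. This is only a guess at the underlying math, not Chen's actual code: it assumes an ideal string (frequency inversely proportional to length), equal temperament, and A4 tuned to 440 Hz. The function names are made up for illustration.

```python
# Sketch: relative string length for a pitch, assuming an ideal string
# (frequency inversely proportional to length) and equal temperament.

A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4


def frequency(midi_note: int) -> float:
    """Equal-tempered frequency in Hz for a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)


def relative_length(midi_note: int, open_note: int) -> float:
    """Length of the string sounding midi_note, as a fraction of an
    open string tuned to open_note (same tension and mass assumed)."""
    return frequency(open_note) / frequency(midi_note)


if __name__ == "__main__":
    g2 = 43  # G2, the opening note of the prelude
    for name, note in [("G2", 43), ("D3", 50), ("B3", 59), ("G3", 55)]:
        print(name, round(relative_length(note, g2), 3))
```

    Higher notes get shorter strings, and a note one octave up needs a string half as long.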

  • This has been sitting in my drafts folder for a few months. Figured…

  • Quick announcement: I have a handful of signed Visualize This copies available in case you’re looking for a gift for that data geek cousin or you’re up for some learning over the holidays. I only have a limited supply, so grab a copy before they’re gone. And of course, you can still get an untarnished version at the major booksellers.

  • During the riots in London this past summer, a lot of information spread quickly about what was going on. Some of that information was true and some was not so true. The Guardian explores this spread of information on Twitter, and how fact and fiction seem to reveal themselves on their own:

    A period of unrest can provoke many untruths, an analysis of 2.6 million tweets suggests. But Twitter is adept at correcting misinformation – particularly if the claim is that a tiger is on the loose in Primrose Hill.

    Other rumors include that rioters cooked their own food at McDonald’s (false), that the London Eye was set on fire (false), and that Miss Selfridge was set on fire (true).

    Each bubble represents a tweet and is sized by the number of followers the tweeter has. The big one is usually the original tweet, and the small ones that cluster around it are retweets. The colors represent tweets that support, oppose, question, or comment. So when you play the animation for each rumor, bubbles swiftly pop up at the rumor peaks and then settle at true or false.

    You can also use the scroll to move to a certain point in time, and roll over bubbles to see the tweets.

    Really nice graphic and worth a look.

    [Guardian via @jakeporway]

  • As part of their series on road accidents, BBC News mapped every recorded death on the road in Great Britain, from 1999 to 2010. That’s 2,396,750 road crashes. As you’d expect, the map looks a lot like population density, but check out the videos, which show twelve years of data compressed as if it were one week, played out over a few minutes. Each light represents an accident.

    Contrast with road fatalities in the United States.

    Update: The BBC headline and copy seem to conflict, but this seems to be just accidents, and I’m not sure when casualties enter the equation. At 2.4 million crashes over 12 years, that’s about 547 per day.

    [BBC News via @aaronkoblin]

  • Famed statistician John Tukey created the boxplot in 1970. It shows a distribution summary in a small amount of space. Hadley Wickham and Lisa Stryjewski look back on the old standby and its evolution up to the present. Keep in mind that, while it is still used today, the boxplot was designed to be drawn with pencil and paper.

    One of the original constraints on the boxplot was that it was designed to be computed and drawn by hand. As every statistician now has a computer on their desk, this constraint can be relaxed, allowing variations of the boxplot that are substantially more complex. These variations attempt to display more information about the distribution, maintaining the compact size of the boxplot, but bringing in the richer distributional summary of the histogram or density plot. These plots can overcome problems in the original such as the failure to display multi-modality, or the excessive number of “outliers” when n is large.

    Alright, computers are useful. I guess.

    [40 years of boxplots]
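
    As a reminder of how little the basic boxplot needs, here is a minimal sketch of the summary behind it. It assumes the common 1.5 × IQR rule for flagging outliers and uses Python's "inclusive" quartile method, which is not exactly the same as Tukey's hand-computed hinges. The function name and sample data are made up for illustration.

```python
import statistics


def boxplot_summary(values):
    """Quartiles, whisker ends, and outliers for a basic boxplot.

    Assumes the common 1.5 * IQR fence rule; quartiles come from
    statistics.quantiles, which can differ slightly from Tukey's hinges.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    # Whiskers extend to the most extreme points still inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]  # made-up data
    print(boxplot_summary(sample))
```

    Five numbers plus a list of stragglers, which is about what you can manage with pencil and paper.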

  • Form design intern at Fathom, James Grady, maps population density in Dencity:

    Dencity maps population density using circles of various size and hue. Larger, darker circles show areas with fewer people, while smaller, brighter circles highlight crowded cities. Representing denser areas with smaller circles results in additional geographic detail where there are more people, while sparsely populated areas are more vaguely defined.

    While we’ve seen population density mapped, both directly and indirectly, the circle approach brings a different aesthetic that seems to say something about what it’s like to live somewhere. Compare to a broader country-level map or one that uses only color. Doesn’t this version feel like more?

  • Project Stimmungsgasometer (say what?) is a giant smiley face that changes based on the mood of Berlin citizens. When they are collectively “happy,” the light is a smile, and when they are not, it is a sad face. Input comes from facial recognition software that takes in video from a strategically placed camera. The software estimates whether passersby are happy or not, and then the installation changes accordingly.

  • Shan Carter, who makes interactive graphics for The New York Times, talks about telling stories with data in his aptly named presentation, “How I tried for years to find the perfect form for interactive graphics, how I failed, and why, whether a perfect form exists or not, I’ve stopped my desperate pursuit.”

    He starts with finding a balance between statistical analysis and story, and then finishes with the kicker that visualization is a form of communication just like a movie or a book. And that carries with it its own implications.

    The short Q&A at the end is pretty good, too. Just ignore the first obligatory question on how you make graphics that get more traffic.

    [Video Link via @mericson]

  • Testing the idea of six degrees of separation, first proposed by Frigyes Karinthy, the Facebook Data Team and researchers at the Università degli Studi di Milano found that most of us are connected by even fewer degrees, and average separation is getting smaller:

    While we will never know if it was true in 1929, the scale and international reach of Facebook allows us to finally perform this study on a global scale. Using state-of-the-art algorithms developed at the Laboratory for Web Algorithmics of the Università degli Studi di Milano, we were able to approximate the number of hops between all pairs of individuals on Facebook. We found that six degrees actually overstates the number of links between typical pairs of users: While 99.6% of all pairs of users are connected by paths with 5 degrees (6 hops), 92% are connected by only four degrees (5 hops). And as Facebook has grown over the years, representing an ever larger fraction of the global population, it has become steadily more connected. The average distance in 2008 was 5.28 hops, while now it is 4.74.

    So when you see random strangers, shake their hands and say hello. You’re practically best friends.

    Too bad there isn’t an interactive where we can enter random names to see how close we are.

    [Facebook]
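
    To make "average distance" concrete, here is a toy sketch that counts hops with a plain breadth-first search. This is not the method from the study, which approximated distances because exact search over every pair is far too slow at Facebook's scale. The friend graph and function name below are made up for illustration.

```python
from collections import deque


def average_distance(graph):
    """Mean number of hops between all connected pairs in an
    undirected graph given as {node: [neighbors]}.

    Exact breadth-first search from every node; fine for a toy
    graph, far too slow for a network with hundreds of millions of users.
    """
    total_hops = 0
    pair_count = 0
    for source in graph:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    queue.append(neighbor)
        total_hops += sum(hops.values())
        pair_count += len(hops) - 1  # exclude the source itself
    return total_hops / pair_count if pair_count else 0.0


if __name__ == "__main__":
    # Made-up chain of friends: ann - bob - cat - dan
    friends = {
        "ann": ["bob"],
        "bob": ["ann", "cat"],
        "cat": ["bob", "dan"],
        "dan": ["cat"],
    }
    print(round(average_distance(friends), 2))
```

    Add a single friendship between ann and dan and the average drops, which is the same effect the study describes as the network fills in.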

  • Cathy O’Neil on when there’s enough data to justify a data scientist in the workplace:

    Too much to fit on an Excel spreadsheet. And it’s not just how much, it’s really about how high quality the data is; the best is for it to be clean and for it to not be public, or at least not generally used for the purpose that your business uses it for.

    Even data that does fit in Excel can be examined more closely. Then again, if you only have that much data, your data scientist will get bored quickly.

    [VentureBeat]

  • For The Guardian, ITO World maps about 370,000 road-related deaths from 2001 through 2009, according to the National Highway Traffic Safety Administration. The map is kind of rough around the edges, but it gets the job done. Zoom in to the location of your choice either by clicking the buttons or by typing the area you want into the search box. Zoom in all the way, and you’ll notice each accident is represented by an icon indicating the type of accident, the age of the person who died, and the year of the crash.

    As you might expect, accidents are more concentrated at city centers and on highways. What I didn’t expect was all the pedestrians involved.

    [Guardian and ITO World]

  • There’s so much emphasis and attention on Black Friday, the day of sales after Thanksgiving in the States. People line up for hours before stores open at midnight in hopes that they’ll be able to get the best deal, but it looks like Black Friday isn’t even the day to get the best deals:

    For higher-end electronics, Mr. de Grandpre’s trends show, shoppers should wait until the week after Thanksgiving.

    “Black Friday is about cheap stuff at cheap prices, and I mean cheap in every connotation of the word,” Mr. de Grandpre said. Manufacturers like Dell or HP will allow their cheap laptops to be discounted via retailers on that Friday, but they will reserve markdowns through their own sites for later.

    When later? Cyber Monday is a good day to buy.

    On a whim, we found ourselves at a midnight Black Friday at the mall. I was like, “Eh, it shouldn’t be that busy this late at night.” So wrong. The avoidance of large crowds is enough of an incentive for me to wait. Although if I were a young, teenage girl in the market for a nice pair of boots, I suppose I might sing a different tune.

    [New York Times via @drewconway]