• Thessaly La Force, with illustrator Jane Mount, recently published My Ideal Bookshelf, which is a look into the books that some people of interest, including Judd Apatow, Chuck Klosterman, and Tony Hawk, would like to have on their ideal bookshelf. La Force’s boyfriend took a more data-centric look at the collections.

    In the network above, each node is a person who listed their ideal books, and connections represent people who named the same books. Those in the center of the network had more book similarities than those on the edges. For example, James Franco named a ton of books and as you might expect has a bunch of connections. [via @shiffman]
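
    A network like this can be sketched in a few lines. The example below is only an illustration of the idea, not the method used for the actual graphic: the names and titles are made up, and the helpers `shared_book_edges` and `degree` are hypothetical. Each person is a node, and two people are connected if they listed at least one of the same books.

```python
from itertools import combinations

# Hypothetical sample data (not from the book): person -> titles they listed.
shelves = {
    "Reader A": {"Moby-Dick", "Ulysses", "Beloved"},
    "Reader B": {"Ulysses", "Beloved"},
    "Reader C": {"Moby-Dick"},
    "Reader D": {"Dune"},
}


def shared_book_edges(shelves):
    """Return {(person1, person2): shared title count} for pairs with overlap."""
    edges = {}
    for a, b in combinations(sorted(shelves), 2):
        overlap = shelves[a] & shelves[b]
        if overlap:
            edges[(a, b)] = len(overlap)
    return edges


def degree(edges):
    """Count how many connections each connected person has."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


edges = shared_book_edges(shelves)
degrees = degree(edges)
print(edges)
print(degrees)
```

    Someone who lists many books, like Reader A here, tends to overlap with more people and so picks up more connections, while Reader D shares nothing and gets no edges at all.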

  • By now, everyone’s heard of Moneyball. Applying statistics to baseball to build the best team for the buck. Naturally, there’s a lot of interest these days in applying the same data-based philosophy to other sports. Jennifer Fewell and Dieter Armbruster used network analysis to model gameplay in basketball.

    To analyze basketball plays, Fewell and Armbruster used a technique called network analysis, which turns teammates into nodes and exchanges — passes — into paths. From there, they created a flowchart of sorts that showed ball movement, mapping game progression pass by pass: Every time one player sent the ball to another, the flowchart lines accumulated, creating larger and larger arrows.

    Using data from the 2010 playoffs, Fewell and Armbruster’s team mapped the ball movement of every play. Using the most frequent transactions — the inbound pass to shot-on-basket — they analyzed the typical paths the ball took around the court.

    The challenge with basketball is that play is continuous, whereas baseball events are discrete, so you can’t apply the same methods. But if you can model the game properly, you know where to optimize and areas that need work.
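
    The nodes-and-passes idea described above can be sketched as a weighted, directed edge count. This is a rough illustration under assumptions, not the researchers' actual model: the possession sequences and position labels are invented, and `pass_network` is a hypothetical helper. Each pass from one node to the next adds one to that edge's weight, which is what would make the arrows grow.

```python
from collections import Counter

# Hypothetical possessions: the ordered list of who handled the ball,
# from the inbound pass to the shot. Labels are made up for illustration.
possessions = [
    ["Inbound", "PG", "SG", "Shot"],
    ["Inbound", "PG", "C", "Shot"],
    ["Inbound", "PG", "SG", "PG", "Shot"],
]


def pass_network(possessions):
    """Count directed (passer, receiver) transitions across all possessions."""
    edge_weights = Counter()
    for sequence in possessions:
        # Pair each node with the one that follows it in the possession.
        for passer, receiver in zip(sequence, sequence[1:]):
            edge_weights[(passer, receiver)] += 1
    return edge_weights


network = pass_network(possessions)
heaviest_edge, heaviest_weight = network.most_common(1)[0]
print(network)
print(heaviest_edge, heaviest_weight)
```

    The heaviest edges are the paths the ball takes most often, which is roughly where you would start looking for what to optimize.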

  • As 2013 nears, let the recaps, reviews, and best ofs begin. Twitter put up their 2012 year in review of top tweets, trends, and such, which is mostly pictures and lists, but in collaboration with Vizify, they also have a section to visualize your own tweets. Click on the “View year on Twitter” button in the top right. Here’s mine, for example. (Surprise, I mention maps, data, and charts often.)

    It’s a word frequency chart that shows usage over the year. Scroll left to right or mouse over bubbles to see specific tweets. Mostly, it’s just fun to look back. [Thanks, Todd]
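
    For a sense of what sits behind a chart like this, here is a minimal sketch that counts word usage per month. It assumes tweets are already available as (month, text) pairs; the sample tweets and the `monthly_word_counts` helper are made up, and this is not how Vizify built theirs.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample: (month number, tweet text).
tweets = [
    (1, "New maps of data"),
    (1, "Charts and maps"),
    (2, "More data, more charts"),
]


def monthly_word_counts(tweets):
    """Return {month: Counter of lowercase words used that month}."""
    counts = defaultdict(Counter)
    for month, text in tweets:
        # Keep runs of letters and apostrophes; drop punctuation and digits.
        words = re.findall(r"[a-z']+", text.lower())
        counts[month].update(words)
    return counts


counts = monthly_word_counts(tweets)
print(counts[1].most_common(3))
print(counts[2].most_common(3))
```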

  • This one’s for you Game of Thrones fans and aficionados. Jerome Cukier visualized groups of people, from Lannisters to Starks, and kills throughout the books. Each circle represents a character and is sized by number of appearances. Color represents status, and connecting lines are killer-killee relationships (aw, so sweet). The best part is that this all plays out over time.

  • From machine learning to data mining. From statistics to probability. A lot of it seems similar, so what are the differences? Statistician William Briggs explains in an FAQ.

    What’s the difference between machine learning, deep learning, big data, statistics, decision & risk analysis, probability, fuzzy logic, and all the rest?

    None, except for terminology, specific goals, and culture. They are all branches of probability, which is to say the understanding and sometime quantification of uncertainty. Probability itself is an extension of logic.

    I was surprised he didn’t throw data science into the mix, but you could and the document would pretty much be the same.

  • Andrew Barr and Richard Johnson for the National Post took a detailed look at the who, what, and when of Walking Dead kills.

    While AMC lets The Walking Dead gang take a short mid-season break — the Post’s Andrew Barr and Richard Johnson look at a few of the key statistics of two-and-a-half seasons’ worth of undead mayhem. They find noteworthy — the gradual increase in the body count, the increasingly creative means of Zombie dispatch, and the fact that every character seems to have developed a clear enjoyment for putting the ambulatory cadavers down for good.

    They also included weapons used, ranging from handgun to tree branch. See the full version here. Somewhere there’s a piece of paper with a ton of tally marks on it.

    [Thanks, Thomas]

  • James Grady from Fathom Information Design had a look at the family tree of All in the Family, a popular television show from the 1970s:

    All in the Family was the origin of seven spin-off shows that aired between the early ’70s and the mid-’90s: Maude, Good Times, The Jeffersons, Checking In, Archie Bunker’s Place, Gloria, and 704 Hauser.

    In tribute to nostalgia, the end of fall and its beautiful colors, and my fascination with retro TV shows, I’ve created All in the Family Tree, an interactive visualization of all the characters from each of the eight shows listed above. Each character is represented by a leaf and each show is indicated by a separate color. A branch line connects a character’s crossover from original show to spin-off and vice versa.

    It’s a charming piece that’s sure to bring back good memories for anyone who watched the shows. I was too young to appreciate them at the time, and all I can remember is the opening sequence of The Jeffersons. I think they were moving on up. To the east side.

  • Max Fisher for the Washington Post mapped country emotion ratings, based on the results of a recent Gallup study. Singapore was ranked least emotional, whereas the Philippines was ranked most emotional. The United States was also relatively high. From Gallup:

    While higher incomes may improve people’s emotional wellbeing, they can only do so to a certain extent. In the United States, for example, Nobel Prize-winning economist Daniel Kahneman and Princeton economist Angus Deaton found that after individuals make $75,000 annually, additional income will have little meaningful effect on how they experience their lives. Consider this finding in the context of Singapore, a country with one of the lowest unemployment rates and highest GDP per capita rates in the world, but a place where residents barely experience any positive emotions. This research shows that it will take more than higher incomes to increase positive emotions or decrease negative emotions. Singapore leadership needs to consider strategies that lie outside of the traditional confines of classic economics and would be well-advised to include wellbeing in its overall strategies if it is going to further improve the lives of its citizenry.

    I’m curious about what we’re seeing here though. The research infers wellbeing, but the survey was done by phone and face-to-face. Did Americans call overseas, or did residents call other citizens? The former might be kind of weird for some.

    More importantly though, they asked questions like “Did you smile or laugh a lot yesterday?” and “Did you experience enjoyment?” Some cultures just don’t express emotions, but it doesn’t mean they don’t feel them. (Read as: I’m not a robot! I have feelings, too!)

    [Thanks, John]

  • In 1979, Atari released Lunar Lander, a game whose object was to land a module safely on the moon. Digital artist Seb Lee-Delisle turned the game into an installation in which you play the game, and your paths are drawn on a wall by a hanging robot. The result, a unique trace of players’ paths in the game, is quite nice.

    I’m surprised we haven’t seen more video game-based pieces like this. The only one that comes to mind is the Just Cause 2 point cloud, which showed 11 million player deaths. It revealed terrain and gameplay mechanics. There’s also this graphic that shows what buttons to push to beat Super Mario Brothers 3, but that doesn’t really count. It’d be fun to see the direct path of a Mario expert versus a novice path that doubles back and ends early. Pac-Man might be a fun one to see, too. Yeah, let’s do that.

  • Jer Thorp talks ethics in the data-as-new-oil metaphor:

    [W]e need to change the way that we collectively think about data, so that it is not a new oil, but instead a new kind of resource entirely. For this to occur we need to foster a deep understanding of data in society. As it happens, humanity has a mechanism for this kind of broad cultural change: the arts. As we proceed towards profit and progress with data, let us encourage artists, novelists, performers and poets to take an active role in the conversation. In doing so we may avoid some of the mistakes that we made with the old oil.

    See also: Jer’s talk on the human side of data.

  • CartoDB mapped every Rolling Stones tour from 1963 to 2007.

    The Stones passed the half-century mark as a band this year. An incredible achievement for an incredible band. They also happen to be one of the most prolific touring bands in the world with more than 1,300 concerts all over the world, and over the last 50 years they have traveled almost 1,000,000 Km (960,000 km actually).

    Made with the newly introduced CartoDB 2.0, with added support for MapBox, more mapping capabilities, and a JavaScript API.

  • A research paper version of Noah Kalina’s photo project by Timothy Weninger. Weninger saved versions of the paper at various stages of writing, and strung them together in a time-lapse video. It reminds me of Ben Fry’s On the Origin of Species.

    I wish I had done this with my dissertation. [via @revodavid]

  • Mike Bostock, Matthew Ericson and Robert Gebeloff for the New York Times explored changing tax rates from 1980 to 2010, for various income levels.

    Most Americans paid less in taxes in 2010 than people with the same inflation-adjusted incomes paid in 1980, because of cuts in federal income taxes. At lower income levels, however, much of the savings was offset by increases in federal payroll taxes, state sales taxes and local property taxes. About half of households making less than $25,000 saved nothing at all.

    Instead of trying to squeeze everything into one space, the graphic reads like a story, with changes in different types of taxes and comparisons across income levels.

  • When you play pinball, the ball bounces around creating paths for itself, and the better you play, the more control you have over those paths. Recent design graduate Sam van Doorn modified a machine so that you can see those paths in his project STYN. A poster is placed underneath the flippers, and the ball gets a douse of paint on the way out, so you get a unique sketch each time you play. [via infosthetics]

  • With Google’s driverless cars now street legal in California, Florida, and Nevada, Gary Marcus for the New Yorker ponders a world where machines need a built-in morality system.

    That moment will be significant not just because it will signal the end of one more human niche, but because it will signal the beginning of another: the era in which it will no longer be optional for machines to have ethical systems. Your car is speeding along a bridge at fifty miles per hour when an errant school bus carrying forty innocent children crosses its path. Should your car swerve, possibly risking the life of its owner (you), in order to save the children, or keep going, putting all forty kids at risk? If the decision must be made in milliseconds, the computer will have to make the call.

    Data analysis seems to be headed in the same direction. Just as machines will have to start making human-like decisions, data represents more of the real world and looks less like snippets in time. As the gap between numbers and what they represent shrinks, we have to think more about ethics, privacy, and whether or not what we’re doing is right.

  • Ben Welsh, Robert Lopez, and Kate Linthicum for the Los Angeles Times analyzed more than a million runs by the Los Angeles Fire Department to estimate response times, based on where you live. The national standard is six minutes. The map shows areas where the average response time exceeds the standard in red and areas where it falls under the standard in green (basically, anywhere there is a fire department).

    The lead-in mentions that LAFD leaders have said that they routinely fail to meet the national standard, but if you’ve driven in Los Angeles, it’s not hard to imagine why it takes those extra minutes. I wonder how this compares to other high-traffic cities.
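
    The basic comparison behind a map like this can be sketched as follows. This is only an illustration, not the Times' actual analysis: the areas and run times are invented, `average_by_area` and `color_for` are hypothetical helpers, and treating an average of exactly six minutes as green is my assumption. Only the six-minute standard comes from the piece.

```python
from statistics import mean

NATIONAL_STANDARD_MINUTES = 6.0  # the standard cited in the piece

# Hypothetical runs: (area name, response time in minutes).
runs = [
    ("Area A", 4.5),
    ("Area A", 5.5),
    ("Area B", 7.0),
    ("Area B", 9.0),
]


def average_by_area(runs):
    """Group response times by area and return the mean for each."""
    grouped = {}
    for area, minutes in runs:
        grouped.setdefault(area, []).append(minutes)
    return {area: mean(times) for area, times in grouped.items()}


def color_for(avg_minutes):
    """Red if the average exceeds the standard, green otherwise (assumed tie rule)."""
    return "red" if avg_minutes > NATIONAL_STANDARD_MINUTES else "green"


averages = average_by_area(runs)
colors = {area: color_for(avg) for area, avg in averages.items()}
print(averages)
print(colors)
```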

  • Now that we’re done giving thanks for all the intangibles like love, friends, family, and drunkenness, it’s time to turn our attention to the physical objects we don’t have yet. It’s the most wonderful time of year! Here are gift ideas for your data geek friends and family. A few of these take a while to make, so be sure to order them now so that you get them in time for Christmas.

  • Studio NAND and Moritz Stefaner, along with Jens Franke, explore FIFA development programs around the world.

    The FIFA Development Globe visualises FIFA’s worldwide involvement in supporting football through educational and infrastructural projects. Using a 3D globe in combination with interconnected interface and visualization elements, the application provides multiple perspectives onto an enormous dataset of FIFA’s activities, grouped by technical support, performance activities, and development projects.

    The globe itself is an icosahedron, or essentially a spherical shape made up of triangles. Triangles in each country represent programs and are colored by the three above categories, and you might recognize Moritz’ elastic lists in the sidebar to filter through programs, by country, organization, and type. There’s also a timeline view, which shows program development over the past five years.

    Give it a go here. I should warn you though that it runs in Flash (a client requirement), and it could run sluggishly depending on your machine. Sometimes I was disorientated by the interaction and animation, especially when I clicked and nothing happened until a few seconds later.