• Thomas H. Davenport and D.J. Patil give the rundown on what a data scientist is, what to look for and how to hire them. It’s an article in Harvard Business Review, so it’s geared towards managers, and I felt like I was reading a horoscope at times, but there are some interesting tidbits in there.

    Data scientists don’t do well on a short leash. They should have the freedom to experiment and explore possibilities. That said, they need close relationships with the rest of the business. The most important ties for them to forge are with executives in charge of products and services rather than with people overseeing business functions. As the story of Jonathan Goldman illustrates, their greatest opportunity to add value is not in creating reports or presentations for senior executives but in innovating with customer-facing products and processes.

    I still call myself a statistician. The main difference between data scientist and statistician seems to be programming skills, but if you’re doing statistics without code, I’m not sure what you’re doing (other than theory).

    Update: This recent panel from DataGotham also discusses the data scientist hiring process. [Thanks, Drew]

  • This month the Netherlands held national elections, and now that the results are in, interaction designer Jan Willem Tulp had a look at voting similarity between cities. I’m not sure what metric was used to judge similarity, but it looks like it was based on voting distributions for candidates.

    Each circle represents a city, and you can choose between a geographic layout or a radial one. When you select a circle, the others change size and color, where more red and larger means more similar. In the radial layout, circles that are farther away are less similar. Be sure to look at the city of Urk in the radial layout. According to Tulp, it’s the most religious city, and it votes completely differently from the rest. [Thanks, Jan]
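
    Since the actual metric isn't stated, here is only a rough sketch of one plausible way to compare voting distributions: cosine similarity between per-city vote-share vectors. The city names, the numbers, and the choice of cosine similarity are all assumptions for illustration, not Tulp's method or real election results.

```python
import math

# Hypothetical vote shares per party for three made-up cities.
# These are NOT real election results.
vote_shares = {
    "City A": [0.30, 0.25, 0.20, 0.25],
    "City B": [0.28, 0.27, 0.22, 0.23],
    "City C": [0.02, 0.03, 0.05, 0.90],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vote-share vectors (1.0 = same mix)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(selected, shares):
    """Rank every other city by similarity to the selected one, most similar first."""
    scored = [
        (city, cosine_similarity(shares[selected], vec))
        for city, vec in shares.items()
        if city != selected
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Assumed similarity metric: cosine similarity, chosen only for illustration.
ranking = rank_by_similarity("City A", vote_shares)
for city, score in ranking:
    print(f"{city}: {score:.3f}")
```

    In a layout like the radial one, a score like this could drive distance from the selected circle, so a city with a lopsided vote (like the made-up City C) would land far from the rest.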

  • I’m not sure what I’d do with Ablaze.js, a JavaScript library by Patrick Gunderson, but the results are sexy. Play around with the app here. [via @jeffclark]

  • The Forest of Advocacy is a series of animations that explores the political contribution patterns among eight organizations, such as Bain Capital, Goldman Sachs, and Harvard Business School.

    These visualizations provide a dynamic look at the partisan tilt of giving within organizations. For each organization, individuals are characterized as points sketching out a line over time. The X axis is time, and the Y axis represents the net partisan tilt of contributions over the preceding 6 months. Over the decades, one sees lines sketched out, reflecting the partisanship of individuals over time. For each organization, we also provide the net contributions of the entire organization, and the names of biggest Democratic, Republican, and “bipartisan” contributors (the individual with the highest product of Democratic and Republican contributions).

    At the core, each animation is a time series chart, but the aesthetic and the narrated animation provide for a more organic feel. In particular, the movements of people, represented by squares shifting straight across or up and down, make it easy to see consistent and not so consistent contributions. [Thanks, Mauro]
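
    The two measures in the description can be sketched roughly as follows. The "bipartisan" score (the product of Democratic and Republican contributions) comes straight from the quoted text. The exact formula for net partisan tilt isn't given, so (D - R) / (D + R) over the trailing six months is an assumption here, and the donors and amounts are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (donor, date, party, amount in dollars).
contributions = [
    ("Alice", date(2012, 1, 15), "D", 500.0),
    ("Alice", date(2012, 3, 1), "R", 250.0),
    ("Bob", date(2012, 2, 10), "R", 1000.0),
    ("Bob", date(2011, 5, 1), "D", 400.0),
    ("Cara", date(2012, 4, 20), "D", 300.0),
    ("Cara", date(2012, 5, 5), "R", 300.0),
]


def months_before(d, months):
    """Date `months` calendar months before d (day clamped to 28 for safety)."""
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, min(d.day, 28))


def totals_by_party(records, start=None, end=None):
    """Sum each donor's giving per party, optionally within (start, end]."""
    totals = defaultdict(lambda: {"D": 0.0, "R": 0.0})
    for donor, when, party, amount in records:
        if (start is None or when > start) and (end is None or when <= end):
            totals[donor][party] += amount
    return totals


def net_tilt(dem, rep):
    """Assumed tilt formula: -1.0 is all Republican, +1.0 is all Democratic."""
    total = dem + rep
    return (dem - rep) / total if total else 0.0


def bipartisan_score(dem, rep):
    """Product of Democratic and Republican giving, as in the quoted description."""
    return dem * rep


# Tilt over the six months preceding a chosen date (the Y value at one X position).
as_of = date(2012, 6, 1)
window = totals_by_party(contributions, months_before(as_of, 6), as_of)
tilts = {donor: net_tilt(t["D"], t["R"]) for donor, t in window.items()}

# Top "bipartisan" contributor over all records.
lifetime = totals_by_party(contributions)
top_bipartisan = max(
    lifetime,
    key=lambda d: bipartisan_score(lifetime[d]["D"], lifetime[d]["R"]),
)
```

    Recomputing the tilt at each time step for each person would give the lines the animations sketch out, under the assumed formula.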

  • Aaron Reuben and Gabriel Isaacman used data from sampling air in tunnels, where there are a lot of cars, to create unique soundscapes that represent the chemicals in the area.

    We created sounds from air samples (atmospheric particulate matter collected on filters) by first using gas chromatography to separate the thousands of compounds in the air (try it with markers at home) and then using mass spectrometry, which gives us a unique “spectrum” for chemicals based on their structure, to identify the compounds and assign them tones. Some compounds end up sounding clear and distinct, while others blur together into unresolvable chords. The result is a qualitative, sensory experience of hard, digital data. You can actually hear the difference between the toxic air of a truck tunnel (clogged with diesel hydrocarbons and carcinogenic particulate matter) and the fragrant air of the High Sierras.

    The audio above represents the air in the Caldecott Tunnel in Oakland, California. Note the heavy hydrocarbons towards the end. Contrast that with the audio for a remote forest in the Sierras below.

  • Emily Chow, Ted Mellnik, and Karen Yourish for The Washington Post mapped where the candidates and their wives have visited since June in an interactive with filters and multiple views.

    On load, you see the visits of all eight, with a comparison between Democrats and Republicans. The map on top shows where, and the time series on the bottom shows when. Click on the map, and it zooms to show visits at city-level, and a click on a time slice updates a list of individual visits. Furthermore, you can select the individuals or categories for just the last 30 days, fundraisers, or your state.

    The interaction lets you narrow down quickly and easily to what you care about. The only other thing I would’ve liked to see is a tighter coupling between the time series and the map.

  • Alberto Cairo’s newly translated book on information graphics, The Functional Art, is a healthy mix of theory and how it applies in practice, and much of it comes from Cairo’s own experiences designing graphics for major news publications. (I don’t think Alberto remembers, but what seems like many years ago, I sat right behind him for two weeks at the New York Times when they brought him in to help illustrate Rafael Nadal’s approach to tennis.)

    His experience is hugely important in making the book work. There’s a growing number of books on information graphics, and many are written and illustrated by people who don’t have much experience displaying information, which leads to art books posing as something else. This isn’t one of those books. Cairo knows what he’s talking about.

    As you flip through, you’ll notice a lot of examples, with a focus on process and even a handful of pencil sketches. The last third of the book is interviews with those well-established in the field, which also walks you through how some graphics were made. There’s a strong undertone of finding the balance between function (e.g. efficiency and accuracy) and engagement (e.g. use of circles).

    Cairo comes from a journalism background, so the book is mostly in the context of presentation, but there’s of course plenty that you can apply to more exploratory graphics. I would say though that Cairo’s strength is in illustration and information, and so the book reflects that. This isn’t a book that covers visual data analysis or statistical concepts, but it is one that explores and describes the making of high quality information graphics that lend clarity to concepts and ideas. If you’re looking for the latter, The Functional Art is worth your time.

    Check out the sample chapter on the publisher page, but then grab it on Amazon and save a few bucks.