• During the Olympics, Studio NAND, Moritz Stefaner, and Drew Hemment tracked Twitter sentiment with Emoto. This interactive installation and data sculpture is the last leg of the project.

    The emoto data sculpture represents message volumes, aggregated per hour and sentiment level in horizontal bands which move up and down according to the current number of Tweets at each time. This resulted in simplified 3-dimensional surfaces which allows visitors to identify patterns in message frequency distribution more easily. And while not being specifically designed in this direction, the surfaces also nicely support haptic exploration.

    The sculpture itself is black and unchanging, and it’s used as a projection surface to display a heat map and overlay text. The projection is controlled by the user, which makes for an interesting blend of physical and digital.

  • A couple of years ago, xkcd ran a survey that asked people to name colors. Stephen Von Worley plotted that data by gender in an interactive.

    That’s a dot for each of the 2,000 most commonly-used color names as harvested from the 5,000,000-plus-sample results of XKCD’s color survey, sized by relative usage and positioned side-to-side by average hue and vertically by gender preference. Women tend to use color names nearer the top, men towards the bottom, and the dashed line represents the 50-50 split (equal usage by both sexes).

    While his original version was static, the interactive version lets you sort by hue, saturation, brightness, popularity, and name length. Most importantly, you can see the color names now when you mouse over. I like the vertical spectrum of purple, where women use names like bright lilac, orchid, and heather, and men tend to label similar shades as purplish, lightish purple, and oh yes, very light purple. [Thanks, Stephen]

  • Thomas H. Davenport and D.J. Patil give the rundown on what a data scientist is, what to look for and how to hire them. It’s an article in Harvard Business Review, so it’s geared towards managers, and I felt like I was reading a horoscope at times, but there are some interesting tidbits in there.

    Data scientists don’t do well on a short leash. They should have the freedom to experiment and explore possibilities. That said, they need close relationships with the rest of the business. The most important ties for them to forge are with executives in charge of products and services rather than with people overseeing business functions. As the story of Jonathan Goldman illustrates, their greatest opportunity to add value is not in creating reports or presentations for senior executives but in innovating with customer-facing products and processes.

    I still call myself a statistician. The main difference between data scientist and statistician seems to be programming skills, but if you’re doing statistics without code, I’m not sure what you’re doing (other than theory).

    Update: This recent panel from DataGotham also discusses the data scientist hiring process. [Thanks, Drew]

  • This month the Netherlands held national elections, and now that the results are in, interaction designer Jan Willem Tulp had a look at voting similarity between cities. I’m not sure what metric was used to judge similarity, but it looks like it was based on voting distributions for candidates.

    Each circle represents a city, and you can choose between a geographic layout or a radial one. When you select a circle, the others change size and color, where more red and larger means more similar. In the radial layout, circles that are farther away are less similar. Be sure to look at the city of Urk in the radial layout. According to Tulp, it’s the most religious city, and it votes completely differently from the rest. [Thanks, Jan]

  • I’m not sure what I’d do with Ablaze.js, a JavaScript library by Patrick Gunderson, but the results are sexy. Play around with the app here. [via @jeffclark]

  • The Forest of Advocacy is a series of animations that explores the political contribution patterns among eight organizations, such as Bain Capital, Goldman Sachs, and Harvard Business School.

    These visualizations provide a dynamic look at the partisan tilt of giving within organizations. For each organization, individuals are characterized as points sketching out a line over time. The X axis is time, and the Y axis represents the net partisan tilt of contributions over the preceding 6 months. Over the decades, one sees lines sketched out, reflecting the partisanship of individuals over time. For each organization, we also provide the net contributions of the entire organization, and the names of biggest Democratic, Republican, and “bipartisan” contributors (the individual with the highest product of Democratic and Republican contributions).

    At the core, each animation is a time series chart, but the aesthetic and the narrated animation provide a more organic feel. In particular, the movements of people, represented by squares shifting straight across or up and down, make it easy to see consistent and not so consistent contributions. [Thanks, Mauro]
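
The two measures in the quote are easy to sketch in code. This is only an illustration, not the project's method: the records are made up, and the tilt formula (D - R) / (D + R) is my assumption, since the quote doesn't say how "net partisan tilt" is computed. The "bipartisan" score follows the quoted definition, the product of Democratic and Republican contributions.

```python
from collections import defaultdict

# Hypothetical records: (contributor, party, amount in dollars).
contributions = [
    ("alice", "D", 500.0),
    ("alice", "R", 400.0),
    ("bob", "D", 2000.0),
    ("carol", "R", 1500.0),
    ("carol", "D", 100.0),
]


def totals_by_party(records):
    """Sum contributions per person and party."""
    totals = defaultdict(lambda: {"D": 0.0, "R": 0.0})
    for name, party, amount in records:
        totals[name][party] += amount
    return dict(totals)


def net_tilt(dem, rep):
    """Assumed definition: (D - R) / (D + R), from -1 (all R) to +1 (all D)."""
    total = dem + rep
    return 0.0 if total == 0 else (dem - rep) / total


def most_bipartisan(totals):
    """Person with the highest product of D and R contributions (per the quote)."""
    return max(totals, key=lambda name: totals[name]["D"] * totals[name]["R"])


totals = totals_by_party(contributions)
```

With these made-up numbers, `most_bipartisan(totals)` picks out the person who gave sizable amounts to both sides, while someone who gave a lot to only one party scores zero on the product.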

  • Aaron Reuben and Gabriel Isaacman used data from sampling air in tunnels, where there are a lot of cars, to create unique soundscapes that represent the chemicals in the area.

    We created sounds from air samples (atmospheric particulate matter collected on filters) by first using gas chromatography to separate the thousands of compounds in the air (try it with markers at home) and then using mass spectrometry, which gives us a unique “spectrum” for chemicals based on their structure, to identify the compounds and assign them tones. Some compounds end up sounding clear and distinct, while others blur together into unresolvable chords. The result is a qualitative, sensory experience of hard, digital data. You can actually hear the difference between the toxic air of a truck tunnel (clogged with diesel hydrocarbons and carcinogenic particulate matter) and the fragrant air of the High Sierras.

    The audio above represents the air in the Caldecott Tunnel in Oakland, California. Note the heavy hydrocarbons towards the end. Contrast that with the audio for a remote forest in the Sierras below.

  • Emily Chow, Ted Mellnik, and Karen Yourish for The Washington Post mapped where the candidates and their wives have visited since June in an interactive with filters and multiple views.

    On load, you see the visits of the eight, with a comparison between Democrats and Republicans. The map on top shows where, and the time series on the bottom shows when. Click on the map, and it zooms to show visits at city-level, and a click on a time slice updates a list of individual visits. Furthermore, you can select the individuals or categories for just the last 30 days, fundraisers, or your state.

    The interaction lets you narrow down quickly and easily to what you care about. The only other thing I would’ve liked to see is a tighter coupling between the time series and the map.

  • Alberto Cairo’s newly translated book on information graphics, The Functional Art, is a healthy mix of theory and how it applies in practice, and much of it comes from Cairo’s own experiences designing graphics for major news publications. (I don’t think Alberto remembers, but what seems like many years ago, I sat right behind him for two weeks at the New York Times when they brought him in to help illustrate Rafael Nadal’s approach to tennis.)

    His experience is hugely important in making the book work. There’s a growing number of books on information graphics, and many are written and illustrated by people who don’t have much experience displaying information, which leads to art books posing as something else. This isn’t one of those books. Cairo knows what he’s talking about.

    As you flip through, you’ll notice a lot of examples, with a focus on process and even a handful of pencil sketches. The last third of the book is interviews with those well-established in the field, which also walk you through how some graphics were made. There’s a strong undertone of finding the balance between function (e.g. efficiency and accuracy) and engagement (e.g. use of circles).

    Cairo comes from a journalism background, so the book is mostly in the context of presentation, but there’s of course plenty that you can apply to more exploratory graphics. I would say though that Cairo’s strength is in illustration and information, and so the book reflects that. This isn’t a book that covers visual data analysis or statistical concepts, but it is one that explores and describes the making of high quality information graphics that lend clarity to concepts and ideas. If you’re looking for the latter, The Functional Art is worth your time.

    Check out the sample chapter on the publisher page, but then grab it on Amazon and save a few bucks.

  • As part of the Stories initiative that Facebook launched yesterday, an interactive map by Stamen Design shows how people are connected on Facebook, which offers a view into how countries are linked by language and history.

    Immigration is one of the strongest links that seems to bind these Facebook neighbors, as thousands of people pour over borders or over seas, seeking jobs or fleeing violence, and making new connections and maintaining old friendships along the way. Economic links, through trade or investment, also seem to be strong predictors of country connectedness. And finally, one of the most overwhelming trends we found as we explored this graphic is the strong tie that remains between nations and their former colonizers, whose continued linguistic, cultural, and economic ties still echo today.

    Stamen also explained other interesting facets in the map.

    When you click on a country, the map updates to show where friends of those in that country are from. The top five are labeled. So whereas previous Facebook maps showed all connections at once, which focused on how many people use the service, this one focuses on the actual connections and what they mean.

  • Kat Downs, Laura Stanton and Karen Yourish of The Washington Post look at the tax breaks from the 1970s to 2011 in an interactive.

    The U.S. government gives away more than $1 trillion a year in tax breaks — subsidies for individuals and companies that are often substitutes for direct government spending.
    Once written into the tax code, they tend to stick around.

    Each stripe represents a tax break, and height represents the value of the break in 2011. Interaction is key here, which lets you select categories such as education and health and mouse over breaks for more information. The chart above is also linked with a time series, which provides an alternative view to the same data.

  • After he saw a New York Times article on the gender gap among Wikipedia contributors (the contributor base is only 13 percent women), Santiago Ortiz plotted articles by number of men versus number of women who edited. It’s interactive, so you can mouse over dots to see what article each represents, and you can zoom in for a closer look in the bottom left.

    At first glance, the difference doesn’t look that big, but notice the values of the axes. The axis for men on the horizontal is from 0 to 200, the axis for women is 0 to 20, and the equal ratio line is the purple one that’s nearly vertical. So the only article with more women contributors is on cloth menstrual pads.

    See also: what the chart looks like with equally-spaced increments. The results are clear.
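
A quick sketch shows why the equal-ratio line looks nearly vertical. I'm assuming a square plot area with linear axes, which the chart may not match exactly. The function name and parameters are mine, for illustration only.

```python
import math


def on_screen_angle(x_max, y_max, data_slope=1.0):
    """Angle in degrees of the line y = data_slope * x as drawn on a square plot.

    Assumes linear axes running from 0 to x_max (horizontal) and
    0 to y_max (vertical), both stretched over the same on-screen length.
    """
    # Fraction of the plot's height and width covered per unit of x.
    rise = data_slope / y_max
    run = 1.0 / x_max
    return math.degrees(math.atan2(rise, run))


# Men on the horizontal from 0 to 200, women on the vertical from 0 to 20.
angle_unequal = on_screen_angle(200, 20)

# Same line with both axes running from 0 to 200.
angle_equal = on_screen_angle(200, 200)
```

Under that assumption, the 10-to-1 difference in axis ranges tilts the line to a little over 84 degrees from horizontal, and the same line sits at 45 degrees when both axes share a scale.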

  • Nate Silver says the weatherman is not a moron.

    Still, most people take their forecasts for granted. Like a baseball umpire, a weather forecaster rarely gets credit for getting the call right. Last summer, meteorologists at the National Hurricane Center were tipped off to something serious when nearly all their computer models indicated that a fierce storm was going to be climbing the Northeast Corridor. The eerily similar results between models helped the center amplify its warning for Hurricane Irene well before it touched down on the Atlantic shore, prompting thousands to evacuate their homes. To many, particularly in New York, Irene was viewed as a media-manufactured nonevent, but that was largely because the Hurricane Center nailed its forecast. Six years earlier, the National Weather Service also made a nearly perfect forecast of Hurricane Katrina, anticipating its exact landfall almost 60 hours in advance. If public officials hadn’t bungled the evacuation of New Orleans, the death toll might have been remarkably low.

    I like the bit later in the article that describes the number crunching machine and how humans are involved in the analysis. The National Weather Service has heavy-duty computing power to process data coming from weather stations across the country, but the computer is still bad at doing a lot of things.

    To most people, statistics means plugging numbers into an advanced calculator that spits out values, without much thought involved. Those people don’t work with data.

  • Election season is in full swing, and the New York Times graphics department is ramping up its election coverage. With newly hired Mike Bostock teamed up with the Times’ interaction guy, Shan Carter, I’m sure we’re in for some interesting work.

    The two, along with Matthew Ericson, covered the words used at the Republican and Democratic Conventions, but yesterday they put up an interactive that shows the words used at both conventions.

    Each bubble represents a word, and the bigger the bubble the more often it was used. The blue and red split compares word usage of Democrats and Republicans, respectively, and bubbles are arranged horizontally left to right, from words favored by Democrats to those favored by Republicans. For example, “forward” is far to the left, and “fail” is far to the right.

    While the visual provides a sense of what was talked about, the best part is that the visualization is an interface into the transcripts. When you click on a word, quotes that use that word are shown, so you can see what was actually said alongside keywords. Plus, you can enter your own word or phrase, and a new bubble is placed accordingly with the corresponding text on the bottom.

  • The 8-inch cube RGB Colorspace Atlas by artist Tauba Auerbach shows every color in said colorspace. Cubic rainbow. What does it mean? [Colossal via @periscopic]

  • Remember photographer Noah Kalina? He took a picture of himself every day for six years and made a time-lapse video with the photos. The Simpsons even did a spoof that showed Homer’s life over a couple of minutes. Kalina’s kept the picture-taking going, and it’s been twelve and a half years now. He made a new video.

    Six years is a long time, but you didn’t see that much change in the first video. In this one, you can start to see the age in his eyes. The forty-year update will be something to see.

    [via kottke]

  • Members Only

    Sometimes these cartograms can distort areas beyond recognition, but they can also provide a better visual representation for a region with a wide range of subregions. At the least, they’re fun to look at.

  • Nancy Lublin, CEO of Do Something, gives a five-minute TED talk on the potential in analyzing text messages. During a texting campaign, Do Something started to receive texts from troubled teenagers about problems that ranged from bullying to rape, which led to the organization’s work in setting up a texting hotline. Lublin hopes that, once the system is built, the data gathered from these messages can be used as a census of problems, and perhaps in the same way that Target uses data to figure out if women are pregnant, but to save lives instead of figuring out what coupons to send.

    [Thanks, Tommy]

  • After identifying 129 metropolitan regions that represent 35 percent of the world’s urban population, LSE Cities mapped some of the densest areas with a simple black and white color scheme. The patterns reveal a footprint of where much of the world’s population lives.

    To get a sense of the spatial dynamics of these city regions, we mapped 12 cases at the same scale with core built-up areas in black and peripheral areas in grey. By comparing the footprint of the world’s largest urban conurbation in Tokyo with Atlanta, our sample’s most land-hungry city region, we see that roughly the same amount of land is occupied by 42 million as by 7.5 million people. Meanwhile, the map of London shows that 14 million people are spread across South-east England.

    In other words, that’s a whole lot of people packed into Tokyo. I wonder what these maps would look like with Tokyo density.
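
As a rough back-of-the-envelope version of that question, here's a sketch that uses only the population figures from the quote. It treats Tokyo and Atlanta as occupying the same amount of land, as the quote says they roughly do, and expresses everything relative to Tokyo's footprint because no actual areas are given. The constant and function names are mine.

```python
# Populations quoted by LSE Cities.
TOKYO_POP = 42_000_000
ATLANTA_POP = 7_500_000
LONDON_POP = 14_000_000


def density_ratio(pop_a, pop_b):
    """How many times denser region A is than region B on equal land."""
    return pop_a / pop_b


def footprint_at_tokyo_density(population):
    """Land needed at Tokyo's density, as a fraction of Tokyo's footprint."""
    return population / TOKYO_POP


tokyo_vs_atlanta = density_ratio(TOKYO_POP, ATLANTA_POP)
atlanta_share = footprint_at_tokyo_density(ATLANTA_POP)
london_share = footprint_at_tokyo_density(LONDON_POP)
```

By this estimate Tokyo is about 5.6 times as dense as Atlanta, Atlanta's population would fit in under a fifth of the same footprint at Tokyo density, and London's 14 million would need about a third of it.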