  • September 18, 2018


    Statistics

    When you try to describe the size of something but don’t have an exact measurement, you probably compare it to an everyday object that others can relate to. Using the Google Books Ngram dataset, Colin Morris looked at how such comparisons changed over the past few centuries.

    I especially like the bits of history that explain why some words fell into and out of fashion.

  • September 17, 2018

    There are endangered species where the remaining few in the world could fit in a single train car. Mona Chalabi for The Guardian imagined such a scenario.

    Usually when we talk about scale and putting numbers into perspective, it’s about imagining the large ones. What does a million look like? A billion? Chalabi’s illustrations take it in the other direction.

  • September 17, 2018

    Typhoon Mangkhut went through the northern end of the Philippines a few days ago. At least 25 people died. The New York Times provides a scrolling 3-dimensional view using data collected by NASA satellites.

  • September 14, 2018


    Site News

    I talked with Moritz and Enrico on Data Stories, my favorite visualization podcast. They’ve been providing a healthy balance of practice and research since 2012.

    I don’t dare listen to myself, but based on the show notes, we talked about FlowingData over the years and some of the changes in visualization, and we answered listener questions. You can listen here.

  • September 13, 2018

    The Weather Channel is using a realistic 3-D depiction surrounding a reporter to show what a storm surge might bring. Here, just watch it:

  • September 13, 2018

    Waffle House activated their storm center in preparation for Hurricane Florence. Their restaurants are open 24/7, so they need to keep track of which ones need to close or limit their menus. This might also have to do with an informal Waffle House Index that FEMA described last year:

    If a Waffle House can serve a full menu, they’ve likely got power (or are running on a generator). A limited menu means an area may not have running water or electricity, but there’s gas for the stove to make bacon, eggs, and coffee: exactly what hungry, weary people need.

    It’s more than just a Waffle House though.

    Businesses in communities are often some of the biggest drivers of recovery. If stores can open, people can go back to work. If people can go back to work, they can return to at least one piece of a normal life—and that little piece of normalcy can make a big difference.

    Hold up. I think I got it. If we just keep all the businesses open, we can avoid all disaster. That’s how causation-correlation works, right? Nailed it.

    (Stay safe, Carolinians.)

  • September 13, 2018

    Brian House collected water polluted by acid mine drainage (AMD) for an installation at the Tshimologong Precinct, Johannesburg, and translated pollution levels to sound:

    Acid Love comprises vessels of AMD gathered from a mine on the outskirts of the city. These are connected in an electrical circuit that measures the conductivity from the metals of the water and converts it into sound. The sound is further modulated by data gathered from remediation efforts at the mine. The installation itself also performs a remediation process: over time the metals will precipitate to the bottom of the vessels, and both the sound and the color of the water will change as it is purified.

    [via @blprnt]

  • Members Only
    September 13, 2018


    The Process
    Google released Dataset Search to the world last week. Some asked for my thoughts on the new tool, and as you know, ask and you shall receive. Plus, finding, gathering, and curating data is often the most tedious and time-consuming part of a visualization project. So anything to speed up the collection process is worth a look.
  • September 12, 2018


    Maps

    Hurricane Florence is forecast to make landfall Thursday night or Friday, and as has become the norm, there are several ways to see where the hurricane is and where it might go. Here are a handful of views. Each focuses on a different aspect of the storm’s potential impact.

  • September 12, 2018

    Wikipedia is human-edited, so naturally there are biases towards certain groups of people. Primer, an artificial intelligence startup, is working on a system that looks for people who should have an article. It’s called Quicksilver.

    We trained Quicksilver’s models on 30,000 English Wikipedia articles about scientists, their Wikidata entries, and over 3 million sentences from news documents describing them and their work. Then we fed in the names and affiliations of 200,000 authors of scientific papers.

    In the morning we found 40,000 people missing from Wikipedia who have a similar distribution of news coverage as those who do have articles. Quicksilver doubled the number of scientists potentially eligible for a Wikipedia article overnight.

    Then, after it finds people, it generates sample articles to get things started.

  • September 11, 2018


    Design

    I’m always down for faux vintage, online recreations of actual vintage visualization-related things. Using scans from the real thing, Nicholas Rougeux recreated Werner’s Nomenclature of Colours, supplementing with interaction and photo references.

  • September 10, 2018


    Maps

    You’ve probably seen the maps of Earth at night. They give you a good idea of activity around the world, through the eyes of light. As an experiment and a shift in view, Jacob Wasilkowski mapped the light as terrain.

  • September 7, 2018


    Design

    Graham Douglas, a data journalist at The Economist, looks back on the days when getting data and visualizing it was tedious from start to finish:

    But even these seemingly simple charts had their challenges and took a lot of time to make. Data were found in books by a research department skilled in the art of extracting obscure economic figures and statistics, which were copied to scraps of paper. We would use rulers, dividers, protractors and geometry (Thales’s theorem) to divide axis lines into equal parts to draw the scale ticks. We would plot the data manually in pencil on a special drawing board and sketch out the wording and title for approval before we inked the whole thing in. Text was added last using stencilling, or later, Letraset dry-transfer lettering. Making a spelling mistake was distressing. Areas were filled with sticky-back plastic pre-printed film cut out with a scalpel.

    Maybe grabbing data out of PDF files isn’t so bad.

    No. Still horrible.

    This reminds me of my dad’s work though. He’s a retired civil engineer. When I was young, he brought home these giant blueprints. He’d roll them out after dinner, and armed with a protractor, a scaled ruler, and a calculator I could never figure out, he’d mark up building plans. Towards the end of his career, he kept everything on a flash drive.

  • September 6, 2018

    In a collaboration with Siena College, The Upshot is showing live polling results. The ticker updates in real time with every phone call.

    For the first time, we’ll publish our poll results and display them in real time, from start to finish, respondent by respondent. No media organization has ever tried something like this, and we hope to set a new standard of transparency. You’ll see the poll results at the same time we do. You’ll see our exact assumptions about who will turn out, where we’re calling and whether someone is picking up. You’ll see what the results might have been had we made different choices.


  • Members Only
    September 6, 2018
    Visualization as template-filling content is lazy visualization that no one draws benefit from. Give people a reason to care.
  • September 6, 2018


    Data Sources

    Datasets are scattered across the web, tucked into cobwebbed corners where nobody can find them. Google Dataset Search aims to make the process easier:

    Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page. To create Dataset search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset.

    I’m always a little wary of dataset search engines. They never seem to live up to their promises, because they always require that those with the data do a little bit of work, such as publishing metadata that makes indexing easier. But this is Google. I’ll have to give it a go the next time a curiosity pops up.
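
    For a sense of what that "little bit of work" looks like, here is a rough sketch of the kind of metadata a provider might publish. It assumes the guidelines are expressed with the schema.org Dataset vocabulary embedded in a page as JSON-LD, and it covers only a handful of fields. The function name and every value in the example are made up for illustration.

```python
import json


def dataset_jsonld(name, description, creator, date_published, license_url, url):
    """Build a minimal schema.org Dataset description as a plain dict.

    Only a small subset of possible properties is included here.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,                        # what the dataset is called
        "description": description,          # short summary of the contents
        "creator": {"@type": "Person", "name": creator},  # who created it
        "datePublished": date_published,     # when it was published
        "license": license_url,              # terms for using the data
        "url": url,                          # where the dataset lives
    }


if __name__ == "__main__":
    # Hypothetical dataset; none of these values refer to a real resource.
    metadata = dataset_jsonld(
        name="Example city tree census",
        description="Counts of street trees by species and neighborhood.",
        creator="Jane Example",
        date_published="2018-09-01",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        url="https://example.org/data/tree-census",
    )
    # The JSON string would go inside a <script type="application/ld+json"> tag.
    print(json.dumps(metadata, indent=2))
```

    It is not much, but someone still has to write it for every dataset, which is where these efforts usually stall.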

  • September 5, 2018

    Sports visualization and analysis tend to focus on gameplay: where the players are, where the ball goes, etc. In Reimagine the Game, the focus is on crowd noise through the course of a game. Pick a game and see the waves of noise oscillate through the arena during significant events.

    It’s an advertisement feature on The Economist, which is kind of interesting, but it’s still fun to watch the games play out.

  • September 4, 2018

    It’s getting hotter around the world. The New York Times zooms in on your hometown to show the average number of “very hot days” (at least 90 degrees) since you were born and then the projected count over the next decades. Then you zoom out to see how that relates to the rest of the world.
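
    The per-year count behind a chart like this is simple to compute if you have daily high temperatures. A minimal sketch, assuming the 90-degree threshold is in Fahrenheit and that the input is a list of (date, high) pairs. The function name and the sample readings are hypothetical, not the Times’ data or method.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple

HOT_THRESHOLD_F = 90.0  # "very hot" cutoff; Fahrenheit is an assumption


def very_hot_days_per_year(
    daily_highs: Iterable[Tuple[date, float]],
    threshold: float = HOT_THRESHOLD_F,
) -> Dict[int, int]:
    """Count, for each year, the days whose high is at or above the threshold."""
    counts: Counter = Counter()
    for day, high_f in daily_highs:
        if high_f >= threshold:  # "at least 90 degrees" includes exactly 90
            counts[day.year] += 1
    return dict(counts)


if __name__ == "__main__":
    # Made-up readings for illustration only.
    sample = [
        (date(1990, 7, 1), 91.0),
        (date(1990, 7, 2), 89.9),
        (date(1990, 7, 3), 90.0),
        (date(2018, 8, 1), 95.0),
    ]
    print(very_hot_days_per_year(sample))
```

    Years with no qualifying days are simply absent from the result, so a chart would need to fill those in as zero.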

    I’ve always found it interesting that visualization and analysis are typically “overview first, then details on demand”, whereas storytelling more often goes the opposite direction. Focus on an individual data point first and then zoom out after.

  • August 31, 2018

    Post-game sports interviews tend to sound similar. And when an athlete does say something out of pattern, talk shows and social media examine every word to find hidden meaning. It’s no wonder athletes talk in cliches. The Washington Post, using natural language processing, counted the phrases and idioms that baseball players use.

    We grouped phrases that were variations of each other together (within a one- or two-word difference) into a list of roughly 20,000 possible cliches. Then came the subjective part. From that list, we chose the ones that were the most interesting, then grouped those with similar meanings. And voila — the phrases we considered to be the cream of the cliche crop.

    I can’t decide if the word cloud to open the article is a fun hook or a distraction. I’m leaning towards the former, but I think it would’ve been less the latter without the interaction.
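
    The grouping step the Post describes, phrases within a one- or two-word difference of each other, can be approximated with an edit distance over words instead of characters. This is a sketch under my own assumptions, not the Post’s method: the greedy grouping against the first phrase in each group, the function names, and the sample phrases are all hypothetical.

```python
from typing import List


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance where the unit of editing is a whole word."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a word from a
                curr[j - 1] + 1,     # insert a word from b
                prev[j - 1] + cost,  # keep or substitute a word
            ))
        prev = curr
    return prev[-1]


def group_phrases(phrases: List[str], max_diff: int = 2) -> List[List[str]]:
    """Greedy grouping: a phrase joins the first group whose first
    member is within max_diff word edits; otherwise it starts a new group."""
    groups: List[List[str]] = []
    for phrase in phrases:
        tokens = phrase.lower().split()
        for group in groups:
            if word_edit_distance(tokens, group[0].lower().split()) <= max_diff:
                group.append(phrase)
                break
        else:
            groups.append([phrase])
    return groups


if __name__ == "__main__":
    # Invented examples, not taken from the Post's data.
    for group in group_phrases([
        "one game at a time",
        "one day at a time",
        "take it one game at a time",
        "give 110 percent",
    ]):
        print(group)
```

    A greedy pass like this depends on input order, which is one reason the subjective cleanup afterwards would still be needed.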

  • August 30, 2018


    Design

    When the web was relatively new, things were more of a free-for-all. Everything was an experiment, and it always felt like there were fewer consequences online, because not that many people really used the internet. Now a large portion of people’s lives is online. There is more at stake.

    Tactical Tech focuses in on the (careless) design of systems that allows bad actors to thrive:

    Design can also be weaponised through team apathy or inertia, where user feedback is ignored or invalidated by an arrogant, culturally homogenous or inexperienced team designing a platform. This is a notable criticism of Twitter’s product team, whose perceived lack of design-led response is seen as a core factor for enabling targeted, serious harassment of women by #Gamergate, from at least 2014 to present day.

    Finally, design can be directly weaponised by the design team itself. Examples of this include Facebook’s designers conducting secret and non-consensual experiments on voter behaviour in 2012–2016, and emotional states of users in 2012, and Target, who in 2014 through surveillance ad tech and careful communications design, informed a father of his daughter’s unannounced pregnancy. In these examples, designers collaborate with other teams within an organisation, facilitating problematic outcomes whose impact scale exponentially in correlation with the quality of the design input.