Important Data – Please Act Responsibly

Photo by nyki_m

Data visualization and infographics come in many forms. Some are comical and purely made for entertainment. Others are made for decisions, and important decisions at that. Let’s focus on the latter right now.

To make educated decisions based on graphics, you need accurate ones, and to make accurate graphics, you need a full understanding of the data.

If you don’t know about the data – the context of where it came from or how it was collected – your visualization or infographic is simply a data comic that could potentially misinform its readers.

An Example

You’ve probably seen Al Gore’s documentary on global warming, An Inconvenient Truth. Do carbon emissions from human actions have an effect on global temperature? There’s a lot of scientific evidence that points to yes, but there’s serious debate in Australia right now, led by Australian senator Steven Fielding, against that argument.

Fielding argues that the data show no evidence that human-made carbon dioxide emissions have an effect on global temperature, and he’s holding up this graph as his case in point:

Global Temperature Chart

We see surface air temperature and carbon dioxide concentration from 1995 to present. Carbon dioxide doesn’t seem to be matching up with temperature. What’s going on? Fielding has met with several major climate organizations asking that very question.

Graham Dawson does a good job of summarizing the governmental responses. In short, global temperature is only one measurement of climate change. The environmental models for global climate are of course far more complex, taking in the ocean and atmosphere, for example. I mean, we’re talking about an entire planet here.

However, despite the responses from these high-level organizations, Fielding plays off complexity as ambiguity and denial, and he continues to dwell on the single graph as the tell-all.

Stephen Few interprets Fielding’s stance on global warming:

This is a case of someone who listens only to what he wants to hear (the arguments of a few fringe organizations with agendas) and either ignores or is incapable of understanding the overwhelming weight of scientific evidence. He selected a tiny piece of data (a short period of time, with only one of many measures of temperature), misinterpreted it, and ignored the vast collection of data that contradicts his position. This fellow is either incredibly stupid or a very bad man.

Now, I’m not going to pretend that I know all there is to know about global warming, but simply by reading Fielding’s statement, you certainly do get the feeling that someone is a bit deluded.

The problem is that many people believe Fielding wholeheartedly and are basing their decisions on a single graph that tells an incomplete story. Where’s the responsibility?
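Stephen Few’s point about “a short period of time” is easy to check for yourself. Here is a minimal sketch in Python. It uses a made-up series, not real temperature or carbon dioxide measurements: a steady upward trend plus a repeating wiggle standing in for year-to-year variability. The trend rate, the size of the wiggle, the 11-year cycle, and the 6-year window are all assumptions chosen for illustration, and ols_slope is just a helper name invented for this example.

```python
import math


def ols_slope(xs, ys):
    """Return the ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic series (NOT real climate data): a rise of 0.02 units per year
# plus a repeating 11-year wiggle that stands in for natural variability.
years = list(range(1950, 2010))
values = [
    0.02 * (y - 1950) + 0.3 * math.sin(2 * math.pi * (y - 1950) / 11)
    for y in years
]

# Trend over the full 60-year record.
full_slope = ols_slope(years, values)

# Trend over every 6-year window, keyed by the window's first year.
window = 6
window_slopes = {
    years[i]: ols_slope(years[i:i + window], values[i:i + window])
    for i in range(len(years) - window + 1)
}

# Windows that, taken alone, look flat or falling.
flat_or_falling = [start for start, s in window_slopes.items() if s <= 0]

print(f"full-record slope: {full_slope:.4f} per year")
print(f"steepest falling window: {min(window_slopes.values()):.4f} per year")
print(f"steepest rising window: {max(window_slopes.values()):.4f} per year")
print(f"{len(flat_or_falling)} of {len(window_slopes)} windows look flat or falling")
```

On this toy series the full record slopes upward, yet some six-year windows taken by themselves slope downward. Pick your window carefully and you can tell either story, which is why the time range of a chart deserves as much scrutiny as the numbers in it.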

Know Thy Data

The lesson here isn’t about global warming. It’s that you shouldn’t take data lightly. When you’re dealing with data, you have to look past the numbers.

We’ve been taught that numbers mean hard facts. Numbers don’t lie. But they can. People do it every day, unintentionally and oftentimes on purpose to serve an agenda. Don’t be one of those people or let one of them fool you.

As Steve Duenes, graphics director of The New York Times, puts it:

The graphic’s mission is determined by the data in the same way that story is written based on information the reporter has gathered… If you don’t find interesting or complete information, no amount of design virtuosity will make up for that.

Always question the data. Design around the data instead of shaping data to a design. Your visualization will be the better for it and so will your readers.

Sources: Graham Dawson, The Punch, Information Overload

[Thanks Tim & Stephen]


11 Comments

  • I’m not sure that numbers don’t lie, but charts do not speak the truth.

    Any representation of numbers is a subjective interpretation and cannot be presented as absolute truth, no matter how much we like it.

    Whether Fielding is right or even convincing can be open to debate, but technically speaking his chart is correct.

  • Great stuff Nathan – I couldn’t agree more. As analysts, our responsibility is to tell the truth, the whole truth, and nothing but the truth.

    Numbers may not lie, but they are always influenced by the questions we ask to obtain them. Furthermore, we can choose to interpret those numbers howsoever we please.

    This is why context is so important. To Jerome’s point (above), the accuracy of the numbers is not the issue; if those numbers only focus on one perspective, we’re not telling the whole truth.

    And in such a scenario, those numbers are reduced to propaganda.

    Some more thoughts on that here: http://eskimon.wordpress.com/2009/07/20/propagandata/

  • Good message. I couldn’t agree more. Getting accurate and valid data is one thing (difficult enough on its own), presenting them is another. And last, drawing conclusions is yet another thing.

    If you look at the graph in your example about CO2 and temperature, there are so many questions unanswered:
    – what are the statistics behind this graph? (error levels of the data points, regression, etc.). Without the statistics you can say absolutely nothing about what is happening. You might see a “trend” of rising temperature, but you cannot know that for sure. Maybe the variance is so large that statistically there is no trend in either measure at all.
    – even if you had the statistics behind the data, you have to know if a 15-year period is relevant to what you want to say with the data. 15 years is far too short a period to say anything about climate change. I mean, yesterday it was 26 degrees here, today 20. Does that mean the climate is getting colder?

    More generally: on one side I think it’s good that more and more data is being opened to the “general” public. And it’s good that that data is available and can be used in various ways. However, having a background in science, I know how incredibly difficult it is to deal with that data and use it to say anything about relations between things. Causal relations are even more difficult. So there’s a big danger in data being misused, misinterpreted, etc. Your example is one of many.

  • I agree with eskimon that the context needs to be clearly explained so the rationale and/or assumptions behind the visualization are disclosed.
    The above graph is nearly as bad as the famous hockey stick graph which has since been shown to be statistically incorrect.
    http://www.climateaudit.org/?page_id=354
    People should be aware of how the Earth’s temperature fluctuates over large timescales, the current warming trend since the beginning of the 20th century, the cooling trend preceding it and the periods in the past where the Earth warmed for reasons which are not understood.
    A fair showing of the data should result in concern for our future climate but also realization that there are still significant gaps in our understanding of the Earth’s climate.

  • jeff weir July 20, 2009 at 3:52 pm

    Fantastic post, Nathan. I can’t help reposting some comments I posted at Chandoo’s blog several months back:

    Any correlation we can infer from a line on a chart is only correct IF the model described by the chart is correct. No chart tells us whether we should have used the data as is, or graphed the log of one or the other series, or whether we should have used a linear trend line or a polynomial one. We have to use logic to construct the model, then use a regression to test the hypothesis whether our model is correctly described or not.

    Most people who read charts are not aware of these subtleties. If you place the line somewhere on the chart, you’re telling these people that ‘these things are correlated, and here’s how’. Place the line somewhere else, and you’re telling a slightly different (or perhaps completely different) story. Omit the line and you’re telling yet another story … one that says ‘I don’t think these 2 things have ANY correlation’.

    Use a linear trendline, and get one truth. Use a polynomial trendline and get another truth. Omit one variable and choose another, and get another truth. When it comes to two-dimensional charts, truth (or an awareness of a lack of truth) lies in the eye of the beholder, which is strongly influenced by the decisions of the chart creator.

  • ‘Man-made’ global warming is the biggest crock of crap that the counter-culture has tried to get over on the voting consumer.

  • “This fellow is either incredibly stupid or a very bad man.”

    Speaking from Australia I can confirm it is the former.

  • The temperature axis shows ANOMALY. The amount the average air temperature is “wrong by”. This is greater than zero for the entire chart, although not increasing linearly.

  • AGW proponents have, until recently, been using the exact same chart but over an earlier time range (1980 to 1998 or thereabouts) to “prove” the CO2 – warming link. None of IPCC’s projections have allowed for temporary cooling in the foreseeable future.

    Regardless of what Fielding’s personal agenda may or may not be, the important point to take away from the data is that IPCC’s models and forecasts may not be accurate. That is not to say that there has not been any global warming over the last century, and not to say that there won’t be more over the next. The point is that, if the projections are not being borne out by the data over time, then the wild feedback effects that the IPCC postulates to support their cataclysmic global warming predictions may also not be accurate.

    CO2 may cause some global warming, but assuming a case of zero feedback, the level of warming purely due to CO2 would be modest (somewhere close to 1 degree C). So the important question is “how strong will the feedbacks be, and will they be positive or negative”? I think that question is far from settled, and the current temperature trend supports the idea that there is more work to do on this issue.