What Do You Use to Analyze and/or Visualize Data? [POLL RESULTS]

The most recent FlowingData poll asked what you use to analyze and/or visualize data. Thanks to all 347 of you who participated.

I was surprised by the percentage of you who mainly use Microsoft Excel, mostly because last month’s poll showed a near majority of you in computer science, design, and statistics. Although, R did have a strong showing too. Maybe it’s the information scientists and business folks representing for Excel?

23 Comments

  • nice use of a pie chart :-P

  • Tim’s comment beat mine … Doh!

    Your comment doesn’t accept html img tags … Doh!

  • what’re you talking about? pie charts rock. i am going to use a pie chart in every post from now on. haha.

  • To a certain extent i use Excel because that’s the form that the people reviewing the data want. It gives them a familiar environment to play with the data and make their own charts.

  • 1. What on earth is wrong with a piechart? It is familiar, eleant and easily understood.
    2. Almost everyone who uses R will receive data in CSV or tab-delimited files, and spend a minute in Excel looking at the file format and scrolling through the data looking for NAs or special characters. If you are like me, you will also save your data, then spend a minute in Excel making sure the column headers are not one space out.

  • or ‘elegant’ even

  • Pie chart advantages:
    – They are familiar.

    Pie chart disadvantages
    – They rely on human ability to compare areas or angles, which is less reliable than the human ability to compare lengths and distances.
    – They have one purpose, to show parts of a whole, and aren’t very good at that.
    – People underestimate their effectiveness.
    – They are not information-dense.
    – Their poor effectiveness is magnified if multiple pie charts are used together.

    I will refer you to writings by Bill Cleveland, Edward Tufte, Stephen Few, and many others that give strong arguments against the use of these graphs.

  • Actually the arguments about pie charts are more subtle than you make out. Stephen Few gives a nuanced case for “con” while Spence has a good case for “pro.” On the other hand, Tufte’s arguments are less persuasive (in one of his books he simply calls pie charts “dumb” and leaves it at that). Interestingly, the controversy goes back about 80 years, and in that time there have been many scientific experiments with no definitive findings, which should be a sign that the situation is more complicated than people think.

    An argument like “it’s harder to estimate areas than lengths” oversimplifies matters. A good pie chart simultaneously encodes data with angles, areas, and arc length. Unless you can cite a study that specifically looks at this redundant encoding, you’re not advancing the debate. Furthermore, it’s not even clear that precise communication of data is the most important issue. (After all, a table is the most precise thing you can have, but everyone still likes bar charts).

    Statements like “They have one purpose: to show parts of a whole, and aren’t very good at that” are easy to make but hard to back up. Show me the study that supports this statement: if you read the Few and Spence articles, you’ll see that in fact the literature is thoroughly mixed.

    As for the argument, “people underestimate their effectiveness” I would agree :-)

    The point about information density is bizarre. There’s only a certain amount of information to be had here: do you want Nathan to make up more numbers or something? Or do you want him to shrink the chart? The concept of information density is not meant to be taken literally–otherwise we’d have the absurd idea that this chart would be better as a 30×30 icon. Nathan’s chart is fine as is: big, clear, eye-catching, easy to read.

  • At least it isn’t a 3-D pie… ;)

  • I think Nathan shows a good example of using a pie chart with good color here.

  • Andrew Pratley June 11, 2008 at 2:48 am

    Surely the question to ask is “What program did you analyse the results in, Nathan?” … :-)

    P.S. Props to your comments being in the bright pink box. Weren’t people taking notice of you…?

  • @Jen – You’ve presented a detailed and well-considered argument. I have a few counters.

    Re 80 years of “no definitive findings”: I would think that if pie charts were effective, there would have been a definitive judgment or even consensus in their favor, and we would not be having this debate.

    Angles, areas, and arc length are less effective at showing values than linear dimensions and positions (see Cleveland and more). The fact that pie charts use all of these does not seem to help the argument.

    “As for the argument, ‘people underestimate their effectiveness’ I would agree :-)”

    Unfortunately, I meant to say “OVERestimate”. This is a consequence of their omnipresence and overuse.

    Re data density, the bar chart I offered is 40% the size of Nathan’s pie, and offers twice the raw data values. In this case it is not an important point: Nathan used a large chart for emphasis. However, in addition to their low data density, pie charts can only handle a low data complexity: one series, effective maximum of around a half dozen points. A line/column chart can convey multiple series and even multiple dimensions, not through 3D effects but through multiple series and compound category axis layout.

  • Jen and Jon – Both of you, really GREAT comments. Some of the best here on FlowingData. I like that this pie chart debate happens to be going on in a “what you use to visualize data” post.

    My thoughts:

    Like any chart/graph, it comes down to what you’re using it for. If I had, say, 20 values, then no, I wouldn’t use a pie chart, but that’s a given.

    If I want people to read _exact_ values, then I use a bar chart or something else that’s linear. However, it’s usually the case that we’re after a general idea – a big picture – which pie charts do just fine with, and not nearly as poorly as some would make it out to be.

    What if I removed the numbers from the above graphic? We’d still know that Excel and R dominate, the rest make up near half, and that Actionscript, Processing, and SAS had good showings. After all, when I cut a slice of literal cherry pie for myself and one that’s twice as big for my friend billy bob over there, I can tell the bigger slice is… bigger. The difference between 6% and 7% doesn’t matter so much.

    So what was the purpose of the above graphic? Fun of course :)

  • Andrew – oh, you know me, always crying for attention – love me, pay attention to me, worship me.

  • Andrew Pratley June 12, 2008 at 5:00 am

    @ Nathan:

    1. Done
    2. EFT or Paypal? (Oh, you said pay *attention*) Whups.
    3. Pimping is an ongoing process.

    I also was surprised to find that it was the simple pie chart that resulted in the most interesting discussion.

  • I’d be interested in the results for Stata. Do you think it would make sense to add this as a response option?

  • Regarding the field breakdown in the previous poll: I have seen quite a few ACM Computer Science papers in web mining that use Excel. Actually, I think the ACM template comes in Word format as well as LaTeX. That leads me to believe CS is still pretty happy with MS Office…for papers at least. R seems to be creeping in and within a couple of years should become standard with them.