Statistical Graphics and Information Visualization
The two differ in who uses them, how they are used, and who consumes them, but they have the same goal: to better understand data. You'd think that common bond would draw statisticians and information visualization researchers together for ample collaboration, but that isn't the case. Each group doesn't quite understand what the other is doing, and that's where intermingling gets tricky.
In the most recent Statistical Computing and Graphics newsletter [pdf], two short articles, one from a computer science point of view and the other from statistics, contrast information visualization and statistical graphics.
In the former, Robert Kosara argues for the usefulness of InfoVis, namely that it's not just pretty pictures and static graphics. InfoVis promotes exploration:
And yet, visualization is much, much more than what it appears to be at first glance. The real power of visualization goes beyond visual representation and basic perception. Real visualization means interaction, analysis, and a human in the loop who gains insight. Real visualization is a dynamic process, not a static image. Real visualization does not puzzle, it informs.
In the latter, Andrew Gelman and Antony Unwin argue for the benefits of traditional statistical graphics:
In statistical graphics we aim for transparency, to display the data points (or derived quantities such as parameter estimates and standard errors) as directly as possible without decoration or embellishment. As indicated by our remarks above, we tend to think of a graph as an improved version of a table. The good thing about this approach is it keeps us close to the data.
Wait. Those sound kind of similar. The two articles, written independently of each other, discuss different approaches to visualizing data, but they express similar sentiments.
Oh, but the difference. There has to be a difference.
Kosara uses a spiral graph as an example of interaction with data. It shows periodicity.
You can try an interactive version here. I'm still on the fence about the spiral's usefulness, but it has its merits.
Gelman, despite always starting and ending his critiques with a desire to collaborate and learn, said it demonstrates the "Chris Rock effect: a pleasurable intellectual effort spent in discovering something obvious that could’ve been noticed (and even quantified) much more easily and directly via a simple dot and line plot."
However, as stat researcher Chris Volinsky notes:
The top graphic is really quite nice. (Disclaimer: colleagues of mine at AT&T worked on this but I actually do like it). The fact that calling patterns follow state boundaries in some places but not others is quite interesting and unexpected.
Chris Rock is hilarious, but in this sort of discussion, there's no way to take that but badly. Kosara responded:
That is clearly not what information visualization is about. The problem is not that Gelman misrepresents infovis on purpose, he simply has a skewed picture of what it is.
Truth be told.
This is true of most statisticians I've met and is obvious in Gelman's focus on infovis and aesthetics in follow-up posts. I think he sees the bulk of infovis as beautifying graphics, making data stories more colorful, and drawing in readers. Gelman lumps infographics that hit the front page of Reddit or go viral on Facebook (such as this) together with serious information visualization (such as this). However, Kosara isn't a fan of the former either. It's why he organized a workshop at VisWeek (I tagged along) to encourage visualization researchers to publish their work online. On FlowingData, sometimes I post graphics just because they amuse me, and other times I post them because they're really good work.
From the research side, infovis is about perception, finding what visualization methods work best, and how to make large datasets more approachable and easier to explore.
From the application side, you don't have to look farther than The New York Times. Their graphics and interactives are nice to look at, but the beauty is just a side effect of thoughtful research, design, and journalism.
On the flip side, infovis researchers also have a skewed picture of what statistics is. Most statisticians' work is not seen. It's in models, R scripts, more models, and analytical reports. So graphically speaking, an outsider looking in will see a lot of raw plots generated in R. They were useful to the one who made them, but not to a general audience, and the graphics most likely supplemented a more rigorous analysis. Statisticians like to quantify things more than they like to visualize them.
So again, while statisticians and infovis researchers tackle the same problems, they approach these problems very differently. They're looking for the same trends, patterns, outliers, and correlations, but explanations and representations often don't sound or look the same.
To work together, each has to speak the other's language, and yes, we can all stand to learn a thing or two from each other. Infovis is not just about making things pretty, but about making them more usable and interactive; statistics is not just hypothesis testing and regression, but a more analytically rigorous approach to data. From a non-academic, in-practice perspective, statistical graphics and information visualization actually aren't all that different. Getting along shouldn't be this hard.