Jeffrey Heer et al. write in “Design Considerations for Collaborative Visual Analytics” about a couple of models for social visualization: the information visualization reference model and the sensemaking model. The former is the simpler, more straightforward of the two, moving from raw data -> processed data -> visual structures -> actual visualization, while the latter is a bit more complicated, with similar stages plus feedback loops. My main reflections aren’t so much about the ideas proposed in the paper. Rather, I’m more interested in what was not mentioned, not only in this paper but in other social data analysis papers.
Social data analysis (SDA) so far has mostly stayed inside the visualization bubble, with little talk of statistics or traditional analysis. Don’t get me wrong. I love data visualization and am all for “harnessing the power of human cognition,” but I think some quantitative analysis needs to be in the loop. As most SDA models stand now, analysis starts with the visualization, a group of people interprets the viz, and then somehow they come to some kind of consensus, maybe.
Think Like a Statistician
What if we started with some visualization, then ran some statistics, then went back to the viz, and kept cycling? I’m just thinking of how I would approach a large dataset with a group of stat people. It’s almost always exploratory data analysis (EDA) first: find something interesting in the viz, run some analysis on it, go back to the viz with what you learned, and repeat.
With powerful data visualization coupled with statistics, there’s definitely something there, especially when it’s all wrapped up in the ideas of socialized data. Maybe? At the very least, we can put statistics at the end of the flow chart as some kind of validation of the group’s findings. Visualization and EDA, as they are now, can only take us so far: a pattern that looks convincing in a chart may still be noise, and without a formal test we can’t say how reliable the group’s finding is.
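To make that validation step a little more concrete, here is a minimal sketch of what "statistics at the end of the flow chart" could look like. Suppose the group looks at a chart and agrees that one set of values sits higher than another. A permutation test is one simple way to ask whether a gap that large could plausibly show up by chance. This is my own illustration, not something from the paper: the function name, the numbers, and the choice of a permutation test are all assumptions, and any other suitable test would do.

```python
import random


def average(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Hypothetical helper for illustration: returns the observed difference
    and an estimated p-value (share of shuffles at least as extreme).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = average(group_a) - average(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels: if the groups do not really differ,
        # any split of the pooled values is equally plausible.
        rng.shuffle(pooled)
        diff = average(pooled[:n_a]) - average(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimate away from exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up values standing in for "the thing the group spotted in the viz".
group_a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5, 16.0, 14.4]
group_b = [11.0, 12.2, 11.8, 12.9, 10.7, 12.5, 11.4, 12.0]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference in means: {observed:.3f}")
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here would back up what the group saw in the chart, and a large one would be a cue to go back to the viz and look again, which is exactly the loop described above.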