Martin briefly discusses a presentation at a recent visualization workshop. The speaker blurts, “I don’t care about the data, I am just interested in the method.” This raises the question:
Can you design worthwhile visualization without worthwhile data?
I can see why the speaker said what he did, but you know what, if you don’t care about the data then I probably won’t either, and most likely, I won’t care about your visualization. What do you think? Can useful visualization techniques come out of using whatever datasets?
I asked the same question on Twitter a couple of days ago. Here are a few of the responses:
@EagerEyes: No.
@skylark64: you can, but shouldn’t… Then again, maybe it is worthwhile to someone.
@couch: No.
@vrypan: But that’s the question in the first place! “what’s my data worth?” If you know the answer, tools have little importance.
I think I know where this conversation is headed.
Can you design worthwhile visualization without worthwhile data?
– I agree with vrypan: this is a chicken-and-egg scenario. I think you have to analyze the data to find out if it’s worthwhile, and visualization is a great way to analyze, so yes.
Can you design worthwhile visualization without accurate data?
– The answer here is no. It may look good, but if the data is bad then it is worthless.
I think your mixed answers are due to the ambiguous use of the word ‘worthwhile’.
As an aside, loving the new site feel Nathan, keep it up!
Matt
I think visualizations can be worthwhile even if the data isn’t, but for reasons other than their message. While that does defeat their intended purpose, they can serve another.
Can you design worthwhile visualization without worthwhile data?
For a generalist/technician like me, it’s hard to come up with a “worthwhile” or “meaningful” data set on a moment’s notice. I am frequently making up data to use as examples for my web site, blog, or training sessions. I use them to illustrate data and charting techniques, and to me they make sense.
Invariably someone with greater knowledge in the domain from which I’ve taken the data will comment on the data itself, or the way I’m slicing through it, or even about the charting approach I’m taking. I welcome these comments as discussion points and as learning experiences.
Sure, it sounds dumb, and maybe it is dumb, but the context here is pretty minimal. Is the speaker’s work focused on methods of visualization across a wide variety of data sets? If I were in that position, and somebody at a conference was going on about the details of my example, and missing the big picture of what I was trying to say, I might blurt out something like that too. I would be reacting to a moment, not expounding my philosophy.
I actually think it is possible to implement a level of quality visualization even if there is no sense of how worthwhile the data is that supports the visualization. You first have to understand human visual perception, leveraging its strengths. If you lack the basic awareness of how people see, then regardless of how worthwhile the data is you risk polluting the visualization out of ignorance. Having good data is certainly helpful, but not necessary.
Since I work with health insurance claims data, I have seen that good visualization techniques can even reveal when the data is actually poor. In other words, if the visualization reveals that a particular trend or rate for a given procedure or diagnosis is not in line with known national/community standards, then it is quite possible that the people recording the bill codes in the health insurance claims are not using the correct codes. In such a situation, the visualization reveals an underlying problem with the data that makes it unreliable (i.e., not worthwhile), so it should be excluded from other analyses. Keep in mind that the intent of such a visualization wasn’t to look for problems in the data; rather, the data problems present themselves through the use of good visualization techniques.
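As a rough illustration of the check described above, here is a minimal Python sketch that compares observed rates against benchmark rates and flags the ones that are far out of line. Everything in it is an assumption made for illustration: the procedure codes, the rates, the 50% tolerance, and the function name `flag_out_of_line` are all hypothetical, not real claims data or a real standard.

```python
def flag_out_of_line(observed, benchmark, tolerance=0.5):
    """Return {code: relative_gap} for codes whose observed rate differs
    from the benchmark rate by more than `tolerance` (as a fraction of
    the benchmark). Codes with no usable benchmark are skipped."""
    flagged = {}
    for code, rate in observed.items():
        expected = benchmark.get(code)
        if expected is None or expected == 0:
            continue  # nothing to compare against
        relative_gap = (rate - expected) / expected
        if abs(relative_gap) > tolerance:
            flagged[code] = relative_gap
    return flagged


# Hypothetical rates per 1,000 members; invented for this example.
observed = {"PROC_A": 12.0, "PROC_B": 3.1, "PROC_C": 45.0}
benchmark = {"PROC_A": 11.0, "PROC_B": 3.0, "PROC_C": 9.0}

flagged = flag_out_of_line(observed, benchmark)
for code, gap in flagged.items():
    print(f"{code}: {gap:+.0%} versus benchmark, check the coding")
```

A flag here only says the number looks suspicious. Whether the cause is miscoding or a real difference in the population still needs someone who knows the domain.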
To echo other comments, I’d say it depends on the definition of “worthwhile data” and “worthwhile visualization.”
As for the data, if we take “worthwhile” to mean reasonably accurate or at least realistic, then no, visualization without worthwhile data is probably not itself worthwhile. But if by “worthwhile” we mean that the data are of interest or importance, then sure, the visualization might still be worthwhile.
“Worthwhile visualization” depends on its purpose and audience. If the intent is to develop or demonstrate a method, and it is meant for visualization enthusiasts or those in that field, then indeed it’s not really the data that is important. If it’s meant for popular consumption or use by experts in the subject of the data, then the data had better be something worthwhile so that the visualization is more than a pretty picture.
While I don’t mind a visualization pro saying “I don’t care about the data,” what bothers me is that I feel like there is a rash of other people exhibiting this attitude too. It’s my big gripe about the whole Web 2.0 world: I perceive a lot of people getting very excited about finding data and sharing data and visualizing that data, but without caring a whit about whether the data or the visualization thereof is actually good for anything. (Not that I have any good examples of this on hand…)
Sure, you can have great, persuasive visualizations with bad data; they’re called propaganda.
Issues of cardinality and range can significantly influence whether a particular visualization is appropriate for a particular data set, IMHO.
Yes you can. But does it have a purpose, outside of being pretty?
Interesting data visualization is not contingent on interesting data, but good data visualization absolutely requires good data. Without data you end up with just “visualization.” Art. Not that there’s anything uniquely wrong with that. Data visualization that doesn’t “care” about the data will lose its context to the subject and the communication will suffer.
There’s a lesson in packaging, my occupational field, where a designer may have a great idea for a package…a real award-winner. If that package isn’t appropriate for the product, it doesn’t mean the package wasn’t clever or interesting…it just means it wasn’t appropriate. Any manufactured product faces this dilemma; where is the balance between design and usability? Design must be informed by the product’s purpose if it’s going to add value to that product. Otherwise the design will feel gratuitous and sit atop the purpose like an ill-fitting Christmas sweater.
I agree with Peter Whitley. Garbage In = Garbage out.
A good visualization should let you evaluate whether the data is garbage or not. A scatter plot is a way to look for correlation and will indicate what is going on. If I make a panel of pie charts to visualize a correlation then I’m probably in trouble.
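For anyone who wants a number to go with the scatter plot, here is a small Python sketch of the Pearson correlation coefficient, using only the standard library. The data points are made up for illustration, and the "typo" value is a hypothetical data-entry error, included to show how one bad point can wreck a relationship that a scatter plot would let you spot at a glance.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, not real data.
xs = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]      # perfectly linear
ys_with_typo = [2, 4, 60, 8, 10]  # same series with one mistyped value

r_clean = pearson_r(xs, ys_clean)
r_typo = pearson_r(xs, ys_with_typo)
print(f"clean: r = {r_clean:.2f}, with typo: r = {r_typo:.2f}")
```

The coefficient alone would not tell you why the second series looks uncorrelated. The scatter plot would, because the outlier sits visibly apart from the line.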
Absolutely, yes. A good visualization can pull unseen insights out of apparently useless data, and that happens quite often.
On the flip side, it’s also quite common to see important data sets get rendered into background noise with a crap/lazy visualization.
However, infographics are most potent when their “info” cannot be called into question and is easily independently verified.
Data visualization is meant to maximize comprehension of a data set. If the data set is “not worthwhile” then you comprehend something “not worthwhile”. In this case the visualization is more objet d’art than business driver. Data matters.
Not to be a snarky purist, but nobody creating a visualization should “care” about the data. This is what should separate a visualization from a sales brochure.
I should’ve expanded a little more on why I just said, “No,” but Twitter makes that difficult…
Design should be considered content the same way data should be considered a part of the visualization. You can design something to look beautiful, but that design will stand as something far greater if the content is meaningful too — same goes for data and visualizations. You can create an incredible visualization, but I’d argue that it’s only as meaningful or as engaging as the data behind it.
That doesn’t mean you can’t design without content or create visualizations without data, but those that have meaningful content/data behind them become that much richer.
It’s not a chicken-and-egg question. Rather, I think the correct answer depends on what you are trying to do. Surely some visuals will very quickly help address the question “do I have good data?” But perhaps an entirely different visualization technique is better once you get to the point where you want to tell the data’s story in layman’s terms.
Since the eventual goal is probably to present a useful and relevant visual to a target audience, I think that an iterative process should be followed. After receiving the data, just start making graphs. Start with the simple stuff: histograms, scatterplots, etc. At this point you shouldn’t be very concerned about data-ink ratios or exactly which graph to use. You are doing this to become more familiar with the data. You may even decide that it is appropriate to clean up the data or employ some sort of transform. If you do, you should leave a trail for yourself or your colleagues just in case you need to explain yourself later. Eventually you’ll get to the point where you’re ready to make THE graph. Try one. Decide whether that graph tells the real story. Get some opinions from your colleagues, just like Nathan does here on occasion. With practice you’ll come up with the right graph for the data and for your target audience.
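The first few rounds of that loop can be sketched in plain Python. This is only one possible way to do it, under assumptions of my own: the values are made up, `text_histogram` is a hypothetical helper written for this example (a stand-in for whatever plotting tool you actually use), and the `trail` list is just a simple way to keep a record of each transform.

```python
import math


def text_histogram(values, bins=5, width=30):
    """Bucket values into equal-width bins and return (counts, lines),
    where lines is a crude text rendering for a quick first look."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values match
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + span * i / bins
        bar = "#" * round(width * c / peak)
        lines.append(f"{left_edge:10.2f} | {bar} {c}")
    return counts, lines


trail = []  # plain-text record of everything done to the data

raw = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13, 40, 250]  # made-up values
trail.append(f"loaded {len(raw)} made-up values")

raw_counts, raw_lines = text_histogram(raw)
print("\n".join(raw_lines))

# Nearly everything lands in the first bin, so try a log transform
# and write down that we did it.
logged = [math.log10(v) for v in raw]
trail.append("applied log10: raw histogram was dominated by one bin")

log_counts, log_lines = text_histogram(logged)
print("\n".join(log_lines))
print("\n".join(trail))
```

None of this is THE graph. It is the quick, ugly first pass, and the trail is what lets you explain the log scale to a colleague later.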
Sound without language = music.
Architecture without people = sculpture.
Visualization without data = art.
miked98 – I don’t really agree with your first two lines, but the third strikes a chord. So much of what passes for visualization fits into the category of Art with Colored Arcs.