Data exploration banned

Posted to Design  |  Tags: , ,  |  Nathan Yau

Statistician John Tukey, who coined Exploratory Data Analysis, talked a lot about using visualization to find meaning in your data. You don’t always know what you’re looking, so you explore it visually. Etyn Adar, who teaches information visualization at the University of Michigan, makes a good case for banning the phrase in his students’ project proposals.

For all the clever names he created for things (software, bit, cepstrum, quefrency) what’s up with EDA? The name is fundamentally problematic because it’s ambiguous. “Explore” can be both transitive (to seek something) and intransitive (to wander, seeking nothing in particular). Tukey’s book seems emphasize the former — it’s full of unique graphical tools to find certain patterns in the data: distribution types, differences between distributions, outliers, and many other useful statistical patterns. The problem is that students think he meant the latter.

I see this sort of thing in my suggestion box too. Data exploration with visualization is good, but when someone describes their project as an exploration tool, it often means it lacks focus or direction. Instead it looks like generic graphs that don’t answer anything particular and leave all interpretation to the reader.

Favorites

Think Like a Statistician – Without the Math

I call myself a statistician, because, well, I’m a statistics graduate student. However, the most important things I’ve learned are less formal, but have proven extremely useful when working/playing with data.

The Most Unisex Names in US History

Moving on from the most trendy names in US history, let’s look at the most unisex ones. Some names have …

Where People Run in Major Cities

There are many exercise apps that allow you to keep track of your running, riding, and other activities. Record speed, …

The Best Data Visualization Projects of 2014

It’s always tough to pick my favorite visualization projects. Nevertheless, I gave it a go.