Think Like a Statistician – Without the Math

I call myself a statistician, because, well, I’m a statistics graduate student. However, ask me specific questions about hypothesis tests or required sampling size, and my answer probably won’t be very good.

The other day I was trying to think of the last time I did an actual hypothesis test or formal analysis. I couldn’t remember. I actually had to dig up old course listings to figure out when it was. It was four years ago during my first year of graduate school. I did well in those courses, and I’m confident I could do that stuff with a quick refresher, but it’s a no go off the cuff. It’s just not something I do regularly.

Instead, the most important things I’ve learned are less formal, but have proven extremely useful when working/playing with data. Here they are in no particular order.

Attention to Detail

Oftentimes it’s the little things that end up being the most important. There was this one time in class when my professor put up a graph on the projector. It was a bunch of data points with a smooth fitted line. He asked what we saw. Well, there was an increase in the beginning, a leveling off in the middle, and then another increase. However, what I missed was the little blip in the curve in the first increase. That was what we were after.

The point is that trends and patterns are important, but so are outliers, missing data points, and inconsistencies.

See the Big Picture

With that said, it’s important not to get too caught up with individual data points or a tiny section in a really big dataset. We saw this in the recent recovery graph. Like some pointed out, if we took a step back and looked at a larger time frame, the Obama/Bush contrast doesn’t look so shocking.

No Agendas

This should go without saying, but approach data as objectively as possible. I’m not saying you shouldn’t have a hunch about what you’re looking for, but don’t let your preconceived ideas influence the results. Because if you go to length looking for some specific pattern, you’re probably going to find it. It’ll just be at the sacrifice of accurate results.

Look Outside the Data

Context, context, context. Sometimes this will come in the form of metadata. Other times it’ll come from more data.

The more you know about how the data was collected, where it came from, when it happened, and what was going on at the time, the more informative your results and the more confident you can be about your findings.

Ask Why

Finally, and this is the most important thing I’ve learned, always ask why. When you see a blip in a graph, you should wonder why it’s there. If you find some correlation, you should think about whether or not it makes any sense. If it does make sense, then cool, but if not, dig deeper. Numbers are great, but you have to remember that when humans are involved, errors are always a possibility.

*Photo by misterbisson

57 Comments

Become a member. Support an independent site. Make great charts.

See What You Get

Favorites

The Best Data Visualization Projects of 2014

It’s always tough to pick my favorite visualization projects. Nevertheless, I gave it a go.

Top Brewery Road Trip, Routed Algorithmically

There are a lot of great craft breweries in the United States, but there is only so much time. This is the computed best way to get to the top rated breweries and how to maximize the beer tasting experience. Every journey begins with a single sip.

Who We Spend Time with as We Get Older

In high school, we spend most of our days with friends and immediate family. But then we get jobs, start a family, retire, and there’s a shift in who we spend our days with.

19 Maps That Will Blow Your Mind and Change the Way You See the World. Top All-time. You Won’t Believe Your Eyes. Watch.

Many lists of maps promise to change the way you see the world, but this one actually does.