Data cleaning tips

Posted to Statistics  |  Nathan Yau

When you first learn statistics, visualization, or any data-related subject, the data is usually given to you in a ready-to-use format. This is so that you can spend most of your time on the topic of interest. But once you step outside the learning bubble, data rarely comes in the format you want.

Marc Bellemare, an associate professor in the Department of Applied Economics at the University of Minnesota, provides some practical tips on how to deal with this. Bellemare’s parting advice:

Really, there is no big secret to cleaning data other than “Document everything” and to save everything in different files and in different locations (i.e., your computer, Dropbox, Google Drive), and there is no other way to learn data cleaning than by doing it.

Yep.

Some of the tips are in the context of a specific software environment, but you can easily apply them to more general situations.
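
As one software-neutral illustration of "document everything" and saving to different files and locations, here is a minimal Python sketch. It is not from Bellemare's post. The function name `save_step`, the log file name `cleaning_log.jsonl`, and the folder layout are all hypothetical, and it assumes your data fits in memory as a non-empty list of dictionaries.

```python
import csv
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_step(rows, step_name, note, out_dirs, log_name="cleaning_log.jsonl"):
    """Write rows to a new CSV in each directory and log what was done.

    Hypothetical helper: rows is a non-empty list of dicts with the same keys.
    An existing file is never overwritten, so every cleaning step keeps its
    own file.
    """
    written = []
    for out_dir in out_dirs:
        out_dir = Path(out_dir)
        out_dir.mkdir(parents=True, exist_ok=True)
        path = out_dir / f"{step_name}.csv"
        if path.exists():
            raise FileExistsError(f"{path} already exists; pick a new step name")

        # Save the data for this step in its own file.
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)

        # Document the step: append one JSON line per saved file.
        entry = {
            "step": step_name,
            "note": note,
            "rows": len(rows),
            "saved_at": datetime.now(timezone.utc).isoformat(),
        }
        with (out_dir / log_name).open("a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")

        written.append(path)
    return written


if __name__ == "__main__":
    # Temporary folders stand in for, say, a local folder and a synced one.
    with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as synced:
        raw = [{"id": "1", "income": " 100 "}, {"id": "2", "income": "250"}]
        save_step(raw, "01_raw", "Data as received", [local, synced])

        cleaned = [{**r, "income": r["income"].strip()} for r in raw]
        save_step(cleaned, "02_trimmed", "Stripped whitespace from income", [local, synced])

        print((Path(local) / "cleaning_log.jsonl").read_text(encoding="utf-8"))
```

The point of refusing to overwrite is that the raw file and every intermediate version stick around, and the log tells you later what you did at each step and why.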
