Cleaning and Formatting Data, What I Use (The Process #64)

I often find myself wondering what the fastest, most efficient way to process a dataset is. If it takes too long to think of an answer, it’s usually better to just do it manually. Put on the headphones and start punching values into a spreadsheet. It almost always takes less time than I thought it would.

But of course, there are many tools to clean up your data, and they can be helpful with the right dataset and situation. I tend to stick to a small handful. Here’s what works for me.


The Process is a weekly newsletter on how visualization tools, rules, and guidelines work in practice. I publish every Thursday. Get it in your inbox or read it on FlowingData.

You also gain unlimited access to hundreds of hours’ worth of step-by-step visualization courses and tutorials, which will help you make sense of data for insight and presentation. Resources include source code and datasets so that you can more easily apply what you learn in your own work.

Your support keeps the rest of FlowingData open and ensures the data keeps flowing freely.

