Are you looking to get into data visualization, but don’t quite know where to begin?
With all of the available tools to help you visualize data, it can be confusing where to start. The good news is, well, that there are a lot of (free) available tools out there to help you get started. It’s just a matter of deciding which one suits you best. This is a guide to help you figure that out.
But before we get into what you should use, a couple of questions.
What data are you looking at?
Hopefully you already have a dataset that you’re interested in. If not, go find one. It’s important to have actual data when you’re learning, because the visualization tool that you use will depend on it.
There are lots of places on the Web to find data. Here are a few worth checking out:
The above is a very small subset of what’s available. Oh, and let’s not forget all the government organizations that have departments dedicated to putting together datasets. Just pick one you’re interested in.
Got your data? Ok, good, on to the next step.
What’s the purpose of your visualization?
The next step is to figure out what you’re trying to do with your visualization. Are you working on a Web application that has some graphs? Is it an interactive tool? Do you want to use better-looking graphs in your slide presentation? Is the visualization for a publication? Do you just need it for analysis?
Again, what you decide here will affect what tool you should use.
What Visualization Software to Use
Now that you have the answers to those two questions in mind, we can make a decision on what will work best for you.
For publication, this means graphics like what you see in the newspaper. Most people use Adobe Illustrator. It gives you control over all the elements in your graphic: color, stroke, font, orientation, etc.
If you want to do something more complicated than your traditional graphs, you can design it by hand in Illustrator, or you can do it in R, a software environment for statistical computing and graphics (either programmatically or with one of the add-on libraries). From R, you can save your graphic as a PDF and open it in Illustrator. That’s usually what I do.
Illustrator is kind of pricey, however. Some have suggested using the open-source alternative Inkscape. I’ve never tried it though.
Example: The New York Times
Many want to add some spice to their presentations. You can use the same software as above, but there’s also not much harm in using Microsoft Excel, despite the stigma. The key here is not to use the default settings. You can actually do a lot in Microsoft Excel and make it look good. Plus, you don’t need to include many details in a graphic made for presentation slides, because people can’t see them from far away.
Personally, I don’t use it much for graphics since I’m comfortable with R and Illustrator.
There are a lot of analysis tools, and the preferred one will change depending on who you ask. I use R, which requires some programming skills. Most people use Excel. I’ve also heard a lot of good things about Tableau Software.
For Web Applications
I’m going to assume you have a programming background if you’re looking to do visualization for a Web application. If you don’t know anything about computer code, you can try Many Eyes or Fusion Charts. You’ll be limited to their offerings though.
Now, if you’re developing for the Web, there are two main options here. The first is Processing, which was designed to make coding easier and to give you more bang for the buck. Check out the site and Processing forums for plenty of tutorials and tips. The end result is a Java applet.
The second, more popular option is Flash. You can either do stuff in the actual Flash program, or you can use ActionScript for a pure coding solution. Either way, the end result is something that runs in the Flash environment. The Flare visualization toolkit is a good place to start.
The upside of Flash is that it tends to load faster than Java, and more people have Flash than Java installed on their computer. You might also be able to get away with just a little bit of code if you stick to the Flash program itself, although if you really want to get serious with visualization, you’ll need to learn ActionScript.
To that end, Processing is a lot easier to learn coding-wise. Plus it’s free and open source.
Processing definitely seems to be the software of choice for artists and designers. Again, it goes back to how easy it is to learn and how much you can do with it. Illustrator is the most common choice for non-interactive graphics since it gives you drag-and-drop control over all the elements.
Example: Processing Gallery
What Software Do You Use?
This is obviously a small subset of what’s available. Ultimately, visualization is not just about using one piece of software, but having a full toolbox at your disposal.
Here’s a list of all the programs, tools, and resources I frequently use. What do you use?