Extract CSV data from PDF files with Tabula

Posted to Software  |  Nathan Yau

Tabula, by Manuel Aristarán, came out months ago, but I've been poking at government data recently and came back to this useful piece of free software to get the data tables out of countless free-floating PDF files.

If you've ever tried to do anything with data provided to you in PDFs, you know how painful it is. You can't easily copy-and-paste rows of data out of PDF files. Tabula allows you to extract that data in CSV format, through a simple interface.

It's not the fastest software in the world, but it really is simple to use and it sure beats manual entry. You just load a PDF file into Tabula, which runs on your computer, highlight the table to extract, and the program does the rest. Save as a CSV and do what you want with it.
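For the "do what you want with it" part, here is a rough sketch of one way to pull a saved CSV into Python using only the standard library. This is my own illustration, not something Tabula provides: the function name `load_table` and the file name in the usage comment are made up, and I'm assuming the first row of the export holds the column headers.

```python
import csv
from pathlib import Path


def load_table(csv_path):
    """Read a CSV file into a list of dicts keyed by the header row.

    Hypothetical helper for illustration; assumes the first row of the
    file contains the column names.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        rows = []
        for row in csv.DictReader(f):
            # Trim stray whitespace around headers and cells. Missing
            # cells come back as None, so treat them as empty strings.
            # Cells beyond the header width land under a None key and
            # are skipped here.
            rows.append(
                {k.strip(): (v or "").strip()
                 for k, v in row.items() if k is not None}
            )
        return rows


# Example use (the file name is an assumption):
# rows = load_table("tabula-export.csv")
# print(len(rows), "rows")
```

Every value comes back as a string, so numbers still need converting before you chart or sum anything.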

Download Tabula here. Find out a little more about it on Source.
