ProPublica opened a data store

Posted to Data Sources  |  Nathan Yau

One of the main challenges of any data project is getting the data. It seems obvious, but the effort required to get the right data to answer a question often catches people off guard. Even data that’s “free” to download can be a huge pain to work with and still end up completely useless. ProPublica, the non-profit newsroom, deals with this on a regular basis and hopes that some of its efforts can turn into a source of funding through the Data Store.

Like most newsrooms, we make extensive use of government data — some downloaded from “open data” sites and some obtained through Freedom of Information Act requests. But much of our data comes from our developers spending months scraping and assembling material from web sites and out of Acrobat documents. Some data requires months of labor to clean or requires combining datasets from different sources in a way that’s never been done before.

In the Data Store you’ll find a growing collection of the data we’ve used in our reporting. For raw, as-is datasets we receive from government sources, you’ll find a free download link that simply requires you agree to a simplified version of our Terms of Use. For datasets that are available as downloads from government websites, we’ve simply linked to the sites to ensure you can quickly get the most up-to-date data.

For datasets that are the result of significant expenditures of our time and effort, we’re charging a reasonable one-time fee: In most cases, it’s $200 for journalists and $2,000 for academic researchers.

I hope it works.
