ProPublica opened a data store

Posted to Data Sources  |  Tags:  |  Nathan Yau

One of the main challenges of any data project is getting the data. It seems obvious, but the effort to get the right data to answer a question seems to catch people off guard. Even data that's "free" to download can be a huge pain that ends up completely useless. ProPublica, the non-profit newsroom, deals with this stuff on a regular basis and hopes that some of their efforts can turn into a source of funding through the Data Store.

Like most newsrooms, we make extensive use of government data — some downloaded from "open data" sites and some obtained through Freedom of Information Act requests. But much of our data comes from our developers spending months scraping and assembling material from web sites and out of Acrobat documents. Some data requires months of labor to clean or requires combining datasets from different sources in a way that's never been done before.

In the Data Store you'll find a growing collection of the data we've used in our reporting. For raw, as-is datasets we receive from government sources, you'll find a free download link that simply requires you agree to a simplified version of our Terms of Use. For datasets that are available as downloads from government websites, we've simply linked to the sites to ensure you can quickly get the most up-to-date data.

For datasets that are the result of significant expenditures of our time and effort, we're charging a reasonable one-time fee: In most cases, it's $200 for journalists and $2,000 for academic researchers.

I hope it works.

Favorites

Famous Movie Quotes as Charts

In celebration of their 100-year anniversary, the American Film Institute selected the 100 most memorable quotes from American cinema, and …

The Most Unisex Names in US History

Moving on from the most trendy names in US history, let’s look at the most unisex ones. Some names have …

Jobs Charted by State and Salary

Jobs and pay can vary a lot depending on where you live, based on 2013 data from the Bureau of Labor Statistics. Here’s an interactive to look.

Reviving the Statistical Atlas of the United States with New Data

Due to budget cuts, there is no plan for an updated atlas. So I recreated the original 1870 Atlas using today’s publicly available data.