Data Sources

Have fun and play with some numbers.

What the federal government has been buying and where from

Data Sources / coronavirus, procurement, ProPublica

The Federal Procurement Data System tracks federal contracts of $10,000 or more. For…
Coronavirus data at the state and county level, from The New York Times

Data Sources / coronavirus, New York Times

Comprehensive national data on Covid-19 has been hard to come by through government…
Restaurant struggles

Data Sources / coronavirus, OpenTable, restaurant

The restaurant industry is taking a big hit right now, as most people…
Nationwide database of credibly accused Catholic clergy

Data Sources / accused, Catholic, clergy, ProPublica

For ProPublica, Ellis Simani and Ken Schwencke compiled an interactive database that you…
Dataset for rejected license plate applications

Data Sources / license plate, Noah Veltman

Noah Veltman just posted a dataset of 23,463 personalized license plate applications that…
Google Dataset Search moves out of beta

Data Sources / datasets, Google, search

Over a year ago, Google released Dataset Search in public beta. The goal…
Scripts from The Office, the dataset

Data Sources / R, scripts, The Office

The decade is almost done. You’re sitting there and you’re thinking: “I wish…
Deaths from child abuse, a starting dataset

Data Sources / Boston Globe, children, deaths, ProPublica

By way of the Child Abuse Prevention and Treatment Act, ProPublica and The…
Sephora dataset is a collection of makeup reviews that mention crying

Data Sources / crying, makeup, Sephora

Interested in reviews on the Sephora website for waterproof makeup, Connie Ye figured…
PG&E providing shapefiles, instead of a working map for shutoffs

Data Sources / PG&E, shapefile

Here in northern California, PG&E is shutting off power to thousands of households…
Search through 3m nonprofit tax records

Data Sources / non-profit, ProPublica, search

ProPublica just released a search tool for nonprofit tax records:
The possibilities are…
Census data downloader to reformat for humans

Data Sources / census, formatting, Los Angeles Times

There is a lot of Census data. You can grab most of the…
Data for 200M traffic stop records

Data Sources / police, tickets, traffic

The Stanford Open Policing Project just released a dataset for police traffic stops…
Looking for common misspellings

Data Sources / Reddit, spelling

Some words are harder to spell than others, and on the internet, sometimes…
Google Dataset Search now in public beta

Data Sources / Google, search

Datasets are scattered across the web, tucked into cobwebbed corners where nobody can…
Download 3 million Russian troll tweets

Data Sources / election, trolls, Twitter

Oliver Roeder for FiveThirtyEight:
FiveThirtyEight has obtained nearly 3 million tweets from accounts…
Rush Hour puzzle solver and generator

Data Sources / game, Rush Hour

The Rush Hour puzzle game was invented by Nob Yoshigahara in the 1970s…
All the building footprints in the United States

Data Sources / buildings, Microsoft

Microsoft released a comprehensive dataset for computer-generated building footprints in the United States.…
Check if your school district or college was investigated for civil rights violations

Data Sources / civil rights, education, ProPublica

The U.S. Department of Education constantly investigates school districts and colleges for civil…
Datasets for teaching data science

Data Sources / R, teaching

Rafael Irizarry introduces the dslabs package for real-life datasets to teach data science:…
