Kaggle Datasets for a place to converge on public data

Posted to Data Sources  |  Tags: ,  |  Nathan Yau

Kaggle just opened up a Datasets section to download and analyze public data.

At Kaggle, we want to help the world learn from data. This sounds bold and grandiose, but the biggest barriers to this are incredibly simple. It’s tough to access data. It’s tough to understand what’s in the data once you access it. We want to change this. That’s why we’ve created a home for high quality public datasets, Kaggle Datasets.

It’s still really new and only has a handful of datasets but it looks interesting. The key is that it’s not just a place to download data. Instead, they have analysis environments and make it easy to share code that makes use of the data. They also make it easy to share results.

Oftentimes, it’s the getting-started hurdle that gets in the way of working with a large-ish dataset. Maybe this will help set things on the right path.

Favorites

Watching the growth of Walmart – now with 100% more Sam’s Club

The ever so popular Walmart growth map gets an update, and yes, it still looks like a wildfire. Sam’s Club follows soon after, although not nearly as vigorously.

A Day in the Life of Americans

I wanted to see how daily patterns emerge at the individual level and how a person’s entire day plays out. So I simulated 1,000 of them.

Reviving the Statistical Atlas of the United States with New Data

Due to budget cuts, there is no plan for an updated atlas. So I recreated the original 1870 Atlas using today’s publicly available data.

Unemployment in America, Mapped Over Time

Watch the regional changes across the country from 1990 to 2016.