Statistics
More than mean and standard deviation.

A majority of your email in Gmail, even if you don’t use it
For reasons of autonomy, control, and privacy, Benjamin Mako Hill runs his own…

Newborn false positives
Shutterfly sent promotional emails that congratulate new parents and encourage them to send…

Random things that correlate
This is fun. Tyler Vigen wrote a program that attempts to automatically find…

Type I and II errors simplified
“Type I” and “Type II” errors, names first given by Jerzy Neyman and…

Naked Statistics
Naked Statistics by Charles Wheelan promises a fun, non-boring introduction to statistics that…

Most underrated films
Ben Moore was curious about overrated and underrated films. “Overrated” and “underrated” are…

Hip hop vocabulary compared between artists
Matt Daniels compared rappers’ vocabularies to find out who knows the most words.…

Hiding a pregnancy from advertisers
You probably remember how Target used purchase histories to predict pregnancies among their…

A principal component analysis step-by-step
Sebastian Raschka offers a step-by-step tutorial for a principal component analysis in Python.…

Analysis of Bob Ross paintings
As a lesson on conditional probability for himself, Walt Hickey watched 403 episodes…

Porn views for red versus blue states
Pornhub continues their analysis of porn viewing demographics in their latest comparison of…

Using Census survey data properly
The American Community Survey, an ongoing survey that the Census administers to millions…

Bracket picks of the masses versus sports pundits
Stephen Pettigrew and Reuben Fischer-Baum, for Regressing, compared 11 million brackets on ESPN.com…

Fox News bar chart gets it wrong
Because Fox News. See also this, this, and this. [Thanks, Meron]…

Big data, same statistical challenges
Tim Harford for Financial Times on big data and how the same problems…

Bike share data in New York, animated
Citi Bike, also known as NYC Bike Share, is releasing monthly data dumps…

Dead links on the Million Dollar Homepage
Remember the Million Dollar Homepage from 2005? It sold ad space to anyone…

Gambling data as a proxy for excitement in sports
After he noticed gambling odds fluctuate wildly at the end of a football…

Where time comes from
The Atlantic interviewed Dr. Demetrios Matsakis, Chief Scientist for Time Services at the…

How people really read and share online
Tony Haile discusses how we read and share online, based on actual data.…

The important parts of data analysis
There’s plenty of software to muck around with data, but to gain the…

Statistical concepts explained through dance
Forget bell curves, jellybeans, and coin flips to explain statistical concepts. Dancing Statistics…

ProPublica opened a data store
One of the main challenges of any data project is getting the data.…

Game theory to win game shows
I like how a little bit of game theory has crept into Jeopardy!…

A visual explanation of conditional probability
Victor Powell, who has visualized the Central Limit Theorem and Simpson’s Paradox, most…

Basketball analytics
Kirk Goldsberry talks about the rise of analytics usage in the NBA. With cameras…

Texting data to save lives
Remember that TED talk from a couple of years ago on texting patterns…

How R came to be
Statistician John Chambers, the creator of S and a core member of R,…

Facebook debunks Princeton study
Researchers at Princeton released a study that said that Facebook was on the…

Using data to find a girlfriend
Remember when Amy Webb created a bunch of fake male profiles to scrape…