Posted to Statistics

What data brokers know about you

Lois Beckett for ProPublica has a thorough piece on data brokers — companies that collect and sell information about you — and what they know…

Using search data to find drug side effects

Along the same lines as Google Flu Trends, researchers at Microsoft, Stanford and Columbia University are investigating whether search data can be used to find…

Netflix data and puppets

Andrew Leonard for Salon fears what might come of the creative process if movies are based on algorithms and data and that we might turn…

This pie chart is amazing.

From the Winnipeg Sun. Something isn't right here. [via]…

Porn star demographics

Jon Millward explored porn star demographics using a data scrape from the Internet Adult Film Database: hair color, race, and birthplace, among other things. (There…

Analysis of LEGO brick prices over the years

Reality Prose has an excellent analysis of the changing price of LEGO bricks over the years and a misconception that cost has gone up. According…

Philosophy of data

David Brooks for The New York Times on the philosophy of data and what the future holds: If you asked me to describe the rising…

The most poisoned name in US history

Biostatistics PhD candidate Hilary Parker dived into the most poisoned names in US history. Her own name topped the list. There were several fad names…

Using data to find a husband

When it was time to settle down with the right man, Amy Webb joined two dating sites, created a profile, and went on some horrible…

Data Analysis (with R) on Coursera

Jeff Leek, an Assistant Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health, is teaching a course on data analysis on…

Statistical network of basketball

By now, everyone's heard of Moneyball. Applying statistics to baseball to build the best team for the buck. Naturally, there's a lot of interest these…

The differences between machine learning, data mining, and statistics

From machine learning to data mining. From statistics to probability. A lot of it seems similar, so what are the differences? Statistician William Briggs explains…

A new kind of resource

Jer Thorp talks ethics in the data-as-new-oil metaphor: [W]e need to change the way that we collectively think about data, so that it is not…

Machines and built-in morality

With Google's driverless cars now street legal in California, Florida, and Nevada, Gary Marcus for The New Yorker ponders a world where machines need a…

Archive of datasets bundled with R

R comes with a lot of datasets, some with the core distribution and others with packages, but you'd never know which ones unless you went…

Incredibly divided nation in a map

I knew things were bad, but I didn't know they were this bad. Obama has his work cut out for him. [Thanks, @adamsinger]…

How Silver predictions performed

By way of Rafa Irizarry from Simply Statistics, a plot of Nate Silver's probabilities for Barack Obama winning a state versus the percentage of vote…

A quick lesson on making predictions

Political analyst and statistician Nate Silver has gotten some flak lately for consistently projecting a 70-plus percent chance of a Barack Obama win this election.…

Data on decades of Boy Scout expulsions released

The Los Angeles Times released nearly 5,000 records of allegations from the Boy Scouts of America as a browseable map and searchable list. You can…

The birthday problem explained

How many people does it take for there to be a 50% chance that a pair in the group has the same birthday? Only 23…
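For anyone who wants to check the 23 figure, here is a minimal sketch of the arithmetic in Python. It assumes 365 equally likely birthdays, no leap days, and independent birthdays, none of which is stated in the excerpt above. The function names are made up for illustration. The idea is to compute the chance that everyone's birthday is different, then subtract that from 1.

```python
def shared_birthday_probability(group_size: int, days_in_year: int = 365) -> float:
    """Chance that at least two people in the group share a birthday.

    Assumes every day is equally likely and birthdays are independent
    (an assumption for this sketch, not a claim about real birth data).
    """
    if group_size > days_in_year:
        # More people than days: a shared birthday is guaranteed.
        return 1.0
    p_all_distinct = 1.0
    for already_taken in range(group_size):
        # Each new person must avoid the birthdays already taken.
        p_all_distinct *= (days_in_year - already_taken) / days_in_year
    return 1.0 - p_all_distinct


def smallest_group_for(target: float, days_in_year: int = 365) -> int:
    """Smallest group size whose shared-birthday chance reaches the target."""
    group_size = 1
    while shared_birthday_probability(group_size, days_in_year) < target:
        group_size += 1
    return group_size


if __name__ == "__main__":
    print(smallest_group_for(0.5))                    # 23
    print(round(shared_birthday_probability(22), 3))  # 0.476
    print(round(shared_birthday_probability(23), 3))  # 0.507
```

With 22 people the chance sits just under one half, and the 23rd person tips it over, which is where the "only 23" answer comes from under these assumptions.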

Data for good, not bad

I'm so glad there are people like Jake Porway in the world. The founder and executive director of DataKind gives his quick pitch on "using…

Hiring a data scientist

Thomas H. Davenport and D.J. Patil give the rundown on what a data scientist is, what to look for and how to hire them. It's…