Posted to Statistics

Madden ratings formula

In the football video game Madden, NFL players are rated on skill, which determines how they play in the game. Neil Paine, with graphics by Reuben Fischer-Baum, describes more than…

Automated Tinder and the Eigenface

Because using Tinder takes up oh so much time swiping, swiping, and swiping, Justin Long made a bot that swipes and starts conversations for him. Step 1: Use his existing…

Trust engineers

The most recent episode of RadioLab is on social experimentation and social networks. More specifically, it covers Facebook and its timeline tinkering.…

Pac-Man ghost algorithms

The ghosts in Pac-Man have different personalities represented by their search technique. For example, Pinky tries to predict where you will be in four moves. I had no idea. Game/Show…
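The Pinky behavior mentioned above can be sketched in a few lines. This is only an illustration, not the game's actual code: it assumes a tile grid with (x, y) coordinates, reads "four moves" as four tiles in Pac-Man's current direction, and the names `DIRECTIONS` and `pinky_target` are made up for the example.

```python
# Hypothetical sketch: direction names and grid layout are assumptions,
# with y increasing downward as on most screen grids.
DIRECTIONS = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def pinky_target(pacman_pos, pacman_dir, lookahead=4):
    """Return the tile `lookahead` steps ahead of Pac-Man's current heading."""
    dx, dy = DIRECTIONS[pacman_dir]
    x, y = pacman_pos
    return (x + lookahead * dx, y + lookahead * dy)


if __name__ == "__main__":
    # Pac-Man at tile (10, 10) heading right: the predicted tile is four to the right.
    print(pinky_target((10, 10), "right"))
```

A ghost that chases this predicted tile instead of Pac-Man's current tile tends to cut the player off rather than trail behind.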

Conflicting views: Public versus scientists

Pew Research Center released a report that compares the public and scientists' views on science and society. On some things, such as the space station, fracking, and bioengineered fuel, U.S.…

Questionable fumble statistics for Deflate-Gate

A data-centric look at New England Patriots fumble rates at home made the rounds this week. The most cited tidbit was that there is only a 1 in 16,233 chance…

Being dangerous

Think big data, and it's tough not to associate it with big corporations that have their own interests in mind. Use data. Make money. It doesn't have to always be…

Genetic algorithm walkers

If you ever wondered what it looks like when QWOP-like figures learn to walk through mutation as dictated by a simplified genetic algorithm, here's your answer. Rafael Matsunaga made a…

Basic chart, wrong conclusions

A short post on Bloomberg from 2013 describes the fall of U.S. men's income for the past forty years. To illustrate, the author uses the chart above, and we're like,…

Translating images to words

With Google's image search, the results kind of exist in isolation. There isn't a ton of context until you click through to see how an image is placed among words.…

Machine learning podcast

I'm glad podcasts are a thing right now. Talking Machines is a new podcast on machine learning, statistics, and data, hosted by journalist Katherine Gorman and computer science professor Ryan…

Fake correlation

Gabriel Rossman, a sociology professor at UCLA, describes colliders: cases where correlation does not equal causation and the correlation itself might not even exist. Referring to the simulated plot…
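A small simulation shows how a collider can produce a correlation out of nothing. This is a minimal sketch and not Rossman's own simulation: it assumes two independent standard normal traits and a selection rule that keeps only cases where their sum exceeds a threshold. The function names, the threshold, and the sample size are all illustrative choices.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_collider(n=10000, threshold=1.0, seed=0):
    """Return (correlation in everyone, correlation in the selected subset)."""
    rng = random.Random(seed)
    # Two traits drawn independently, so they are uncorrelated by construction.
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    # Assumed selection rule: only cases with a high combined score are observed.
    selected = [(x, y) for x, y in zip(a, b) if x + y > threshold]
    sel_a, sel_b = zip(*selected)
    return pearson(a, b), pearson(sel_a, sel_b)


if __name__ == "__main__":
    r_all, r_selected = simulate_collider()
    print(f"all cases: {r_all:.2f}, selected cases: {r_selected:.2f}")
```

In the full sample the correlation sits near zero, while in the selected subset it turns clearly negative, even though neither trait influences the other.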

$5.2 million in extra cab tips, found in public data

A few months ago BusinessWeek ran an article on how much people tip New York cab drivers. There are bumps at 20%, 25%, and 30%, which is expected because those…

Statisticians in World War II

The Economist recounts the stories of statisticians who solved problems during wartime, although they weren't called that until afterward. "Peace finally returned, and the statistical scene in the United Kingdom…

Inadvertent algorithmic cruelty

If you logged into Facebook the past couple of weeks, you saw your friends' automatically generated year-end reviews. Estimated events and popular pictures appear in chronological order. Facebook eventually pinned…

When data gets creepy

You make and publish bits of data about yourself, intentionally and unintentionally, and it goes to the indexed public web or to companies' private black boxes. Ben Goldacre explains why…

Revealing location history via your phone’s Wi-Fi

When you have your phone's Wi-Fi turned on, even if you're not connected to anything, you broadcast the networks you've connected to, which in turn can reveal your location history.…

A collection of small datasets

Sometimes you need data, any data, to test or mess around with. Sometimes you just want to make weird crap. Corpora is a collection of small datasets that might suit…

Unclaimed remains

People die, and for various reasons many bodies go unclaimed. In Los Angeles County, the bodies go to the county crematory. The Los Angeles Times reports, along with a searchable…

Jeopardy! clues data

Here's some weekend project data for you. Reddit user trexmatt dumped a dataset for 216,930 Jeopardy! questions and answers in JSON and CSV formats, a scrape from the J! Archive.…

Prostitution, GDP, and £1.7 billion due

David Spiegelhalter, professor of public understanding of risk, does some back-of-the-napkin math to describe why recent prostitution estimates for the UK are problematic. As always, it's best to do a…

Statistically ignorant

Ipsos MORI, primarily a marketing research group I think, released results of their study on public perception of demographics versus reality, on numbers such as immigration, religion, and life expectancy.…