Posted to Statistics

Being dangerous

Think big data, and it's tough not to associate it with big corporations who have their own interests in mind. Use data. Make money. It doesn't have to always be…

Genetic algorithm walkers

If you ever wondered what it looks like when QWOP-like figures learn to walk through mutation as dictated by a simplified genetic algorithm, here's your answer. Rafael Matsunaga made a…

Basic chart, wrong conclusions

A short post on Bloomberg from 2013 describes the fall of U.S. men's income for the past forty years. To illustrate, the author uses the chart above, and we're like,…

Translating images to words

With Google's image search, the results kind of exist in isolation. There isn't a ton of context until you click through to see how an image is placed among words.…

Machine learning podcast

I'm glad podcasts are a thing right now. Talking Machines is a new podcast on machine learning, statistics, and data, hosted by journalist Katherine Gorman and computer science professor Ryan…

Fake correlation

Gabriel Rossman, a sociology professor at UCLA, describes colliders — or when correlation does not equal causation and the former might not even exist either. Referring to the simulated plot…

$5.2 million in extra cab tips, found in public data

A few months ago BusinessWeek ran an article on how much people tip New York cab drivers. There are bumps at 20%, 25%, and 30%, which is expected because those…

Statisticians in World War II

The Economist recounts the stories of statisticians who solved problems during wartime, although they weren't called that until after. "Peace finally returned, and the statistical scene in the United Kingdom…

Inadvertent algorithmic cruelty

If you logged into Facebook the past couple of weeks, you saw your friends' automatically generated year-end reviews. Estimated events and popular pictures appear in chronological order. Facebook eventually pinned…

When data gets creepy

You make and publish bits of data about yourself, intentionally and unintentionally, and it goes to the indexed public web or to companies' private black boxes. Ben Goldacre explains why…

Revealing location history via your phone’s Wi-Fi

When you have your phone's Wi-Fi turned on, even if you're not connected to anything, you broadcast the networks you've connected to, which in turn can reveal your location history.…

A collection of small datasets

Sometimes you need data, any data, to test or mess around with. Sometimes you just want to make weird crap. Corpora is a collection of small datasets that might suit…

Unclaimed remains

People die, and for various reasons many bodies go unclaimed. In Los Angeles County, the bodies go to the county crematory. The Los Angeles Times reports, along with a searchable…

Jeopardy! clues data

Here's some weekend project data for you. Reddit user trexmatt dumped a dataset for 216,930 Jeopardy! questions and answers in JSON and CSV formats, a scrape from the J! Archive.…

Prostitution, GDP, and £1.7 billion due

David Spiegelhalter, professor of public understanding of risk, does some back-of-the-napkin math to describe why recent prostitution estimates for the UK are problematic. As always, it's best to do a…

Statistically ignorant

Ipsos MORI, primarily a marketing research group I think, released results of their study on public perception of demographics versus reality, on numbers such as immigration, religion, and life expectancy.…

Fallacy of point-and-click analysis

Jeff Leek touches on concerns about point-and-click software to find the insights in your data, magically and with little to no effort. I understand the sentiment, there is a bunch…

Deviations from the mean

As a way to bring context to the rarity of the 18-inning baseball game between the Washington Nationals and the San Francisco Giants this past weekend, Ross Benes compared other…

A simulation of the traveling salesman problem

In a nutshell, the traveling salesman problem is as follows: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that…

Evolution of movies

We know that movies have changed over the decades. We've seen it in declining ratings and box office hits versus Oscar winners. However, these are changes that come along with…

How to be not ignorant about the world

Recurring TED talker Hans Rosling returns with his son and Gapminder Foundation co-founder Ola Rosling and an update on their latest work: the Ignorance Project. They hope to shift people's…

Emotional dynamics of literary classics

As a demonstration of efforts in estimating happiness from language, Hedonometer charts emotion over time for literary classics. The above is the collection of charts for Adventures of Huckleberry Finn…