Posted to Statistics

A collection of small datasets

Sometimes you need data, any data, to test or mess around with. Sometimes you just want to make weird crap. Corpora is a collection of…

Unclaimed remains

People die, and for various reasons many bodies go unclaimed. In Los Angeles County, the bodies go to the county crematory. The Los Angeles Times…

Jeopardy! clues data

Here's some weekend project data for you. Reddit user trexmatt dumped a dataset for 216,930 Jeopardy! questions and answers in JSON and CSV formats, a…

Prostitution, GDP, and £1.7 billion due

David Spiegelhalter, professor of public understanding of risk, does some back-of-the-napkin math to describe why recent prostitution estimates for the UK are problematic. As always,…

Statistically ignorant

Ipsos MORI, primarily a marketing research group I think, released results of their study on public perception of demographics versus reality, on numbers such as…

Fallacy of point-and-click analysis

Jeff Leek touches on concerns about point-and-click software to find the insights in your data, magically and with little to no effort. I understand the…

Deviations from the mean

As a way to bring context to the rarity of the 18-inning baseball game between the Washington Nationals and the San Francisco Giants this past…

A simulation of the traveling salesman problem

In a nutshell, the traveling salesman problem is as follows: "Given a list of cities and the distances between each pair of cities, what is…

Evolution of movies

We know that movies have changed over the decades. We've seen it in declining ratings and box office hits versus Oscar winners. However, these are…

How to be not ignorant about the world

Recurring TED talker Hans Rosling returns with his son and Gapminder Foundation co-founder Ola Rosling and an update on their latest work: the Ignorance Project.…

Emotional dynamics of literary classics

As a demonstration of efforts in estimating happiness from language, Hedonometer charts emotion over time for literary classics. The above is the collection of charts…

Unintentional Venn diagram suggests opposite meaning

Most people probably wouldn't think much about this poster that shows the values of Thomson Reuters. But when you think of the graphic as a…

Not automatic

It's an absolute myth that you can send an algorithm over raw data and have insights pop up. — Jeffrey Heer in For Big-Data Scientists,…

Crisis Text Line releases trends and data

Crisis Text Line is a service that troubled teens can use to find help with suicidal thoughts, depression, anxiety, and other issues via text messaging.…

How charity: water uses data to do more good

Scott Harrison, the founder and CEO of charity: water, describes how the organization uses data to improve what they do, both on the ground and…

Variance be damned

Daniel Colman won $15.3 million in the Big One for One Drop poker tournament, but he seems annoyed about it. What he had…

Geography.

By way of David Kenner, something in this CNN frame seems off.…

Markov Chains explained visually

Adding on to their series of graphics to explain statistical concepts, Victor Powell and Lewis Lehe use a set of interactives to describe Markov Chains.…

Visual Microphone estimates sound from vibrations in objects

A group of researchers from MIT, Microsoft Research, and Adobe Research are experimenting with seemingly inanimate objects as a proxy for sound in the…

This is Statistics

Statistics has an image problem. To the general public, it's old, out of touch, and boring. It's a problem because we place stock in a…

How well we don’t understand probability

All Things Considered on NPR ran a fine series on how we interpret probability and uncertainty. It came in five bits (plus one follow-up), each…

A more visual world data portal

One of the most annoying parts of downloading data from large portals is that you never quite know what you're gonna get. It's a box…