# A quick lesson on making predictions

October 31, 2012

### Topic

Statistics

Political analyst and statistician Nate Silver has gotten some flak lately for consistently projecting a 70-plus percent chance of a Barack Obama win this election. But as Jeff Leek explains, the criticism doesn’t stem from Silver being wrong. Rather, it comes from the critics’ misunderstanding of statistics. Leek provides a quick lesson on how Silver makes his predictions and how the methods apply to other things, like the weather.

Now, this might seem like a goofy way to come up with a “percent chance” with simulated elections and all. But it turns out it is actually a pretty important thing to know and relevant to those of us on the East Coast right now. It turns out weather forecasts (and projected hurricane paths) are based on the same sort of thing — simulated versions of the weather are run and the “percent chance of rain” is the fraction of times it rains in a particular place.

So Romney may still win and Obama may lose — and Silver may still get a lot of it right. But regardless, the approach taken by Silver is not based on politics, it is based on statistics.
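The "simulated elections" idea Leek describes can be sketched in a few lines. This is only a toy illustration, not Silver's actual model: the three states, their electoral votes, and the per-state win probabilities are made up, and the states are treated as independent, which a serious forecast would not assume.

```python
import random

def simulate_win_probability(state_probs, electoral_votes, needed,
                             n_sims=10_000, seed=0):
    """Return the fraction of simulated elections the candidate wins."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # In each simulated election, the candidate carries a state
        # with that state's (assumed) win probability.
        total = sum(
            votes
            for state, votes in electoral_votes.items()
            if rng.random() < state_probs[state]
        )
        if total >= needed:
            wins += 1
    return wins / n_sims

# Hypothetical toy map: made-up states and numbers, not real polling data.
electoral_votes = {"A": 10, "B": 6, "C": 4}
state_probs = {"A": 0.55, "B": 0.60, "C": 0.50}

chance = simulate_win_probability(state_probs, electoral_votes, needed=11)
print(f"Candidate wins in {chance:.0%} of simulated elections")
```

The "percent chance" is nothing more than the share of simulated runs that came out one way, which is the same bookkeeping behind a "percent chance of rain."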

Don’t fear the black box.

• There’s something about probabilities in the 60-75% range that people just don’t get.

I guess it’s because we’re used to newsworthy, controversial things like polls and political divides sitting just a few percentage points away from 50%. In that context, when counting things and breaking down actual distributions, 60-75% is a strong, powerful, rare majority.

If there are 100 people in the room, 53 of them wearing blue shirts and 47 wearing red shirts (a small blue majority), then eyeballing it, you’d have a feeling there were more blue shirts than red shirts. How confident would you be that you were right? Somewhat: you’d trust your judgement substantially more than a coin flip or guess, but not by much. Somewhere in the 60-75% range.

That’s very different from thinking that 60-75% of the shirts are blue. If there were such a strong majority of blue shirts, you’d be almost 100% certain that there were more blue than red.

A 60-75% distribution is a big, solid lean one way. A 60-75% probability is a small lean one way. It’s like rolling a 1, 2, 3, or 4 on a six-sided die (a 4-in-6 chance, or about 67%). You might bet a little on it, but you wouldn’t bet much.
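One rough way to see the difference between a share and a probability is to simulate the shirt example. The sketch below assumes that "eyeballing" means glancing at a random 15 of the 100 people; that glance size is an arbitrary choice for illustration, not something from the comment above.

```python
import random

def prob_glance_shows_blue_majority(n_blue, n_red, glance_size=15,
                                    n_sims=20_000, seed=0):
    """Estimate how often a random glance shows more blue than red shirts."""
    rng = random.Random(seed)
    room = ["blue"] * n_blue + ["red"] * n_red
    hits = 0
    for _ in range(n_sims):
        # A "glance" is modeled as a random sample without replacement.
        glance = rng.sample(room, glance_size)
        if glance.count("blue") > glance_size / 2:
            hits += 1
    return hits / n_sims

# A 53/47 room (small majority) versus a 70/30 room (large majority).
close_room = prob_glance_shows_blue_majority(53, 47)
lopsided_room = prob_glance_shows_blue_majority(70, 30)
print(f"53/47 room: glance shows blue ahead {close_room:.0%} of the time")
print(f"70/30 room: glance shows blue ahead {lopsided_room:.0%} of the time")
```

Under these assumptions, the 53/47 room gives you a modest edge over a coin flip, while the 70/30 room makes the blue majority close to obvious.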

• I think in general, communicating odds as natural frequencies, with percentages as a more accurate backup, helps reduce confusion.

“The chances of Obama winning the election are 72%” sounds like a big, solid lead.

“The chances of Obama winning the election are around 7 in 10 (72%)” frames it in a way which helps you better appreciate that, while it’s more likely to happen than to not happen, it’s not something to bet your house on.

Also, this way does not invite comparison with other %-communicated figures like expected share of the vote.
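As a small sketch of that framing, a helper like the one below turns a probability into the "around 7 in 10 (72%)" wording. The function name and the default denominator of 10 are hypothetical choices, not an established convention.

```python
def as_natural_frequency(probability, out_of=10):
    """Phrase a probability as 'around N in M' with the percentage as backup."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Note: Python's round() sends exact halves to the nearest even count.
    count = round(probability * out_of)
    return f"around {count} in {out_of} ({probability:.0%})"

print(as_natural_frequency(0.72))
```

A larger `out_of` (say 20 or 100) keeps more precision in the spoken part, at the cost of sounding more like a percentage again.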

• A fun exercise is to run Nate Silver through his own black box, asking “Is he biased?” as you work through his book (answer: 41% odds that he’s biased). Regardless, the Bayesian template in his book is outstanding, simple, and clear.

• Raphael

The criticism is also based on the assumption that Nate Silver is gay: http://www.queerty.com/right-wing-analyst-says-nate-silver-is-too-much-of-a-sissy-to-crunch-poll-data-20121031/

When you’re a bigot, numbers and facts don’t mean much to you.

• Only over time (multiple election cycles) will we be able to evaluate Silver’s predictions, since they are inherently probabilistic. Whether Romney or Obama wins, a single result will not prove the ~70% figure wrong. I think most people probably don’t get that.
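To make "evaluate over time" concrete, here is a minimal sketch of two common checks on probabilistic forecasts: the Brier score and a simple calibration check. The forecast record is invented for illustration, and nothing here is claimed to be how Silver's forecasts are actually graded.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    Lower is better; always saying 50% scores 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally sized, non-empty lists")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

def hit_rate(forecasts, outcomes, low, high):
    """Share of events that happened among forecasts with low <= p < high."""
    picked = [o for p, o in zip(forecasts, outcomes) if low <= p < high]
    return sum(picked) / len(picked) if picked else None

# Hypothetical record: ten separate "70%" calls, seven of which came true.
forecasts = [0.7] * 10
outcomes = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1]

single_miss = brier_score([0.7], [0])      # one event alone says very little
overall = brier_score(forecasts, outcomes)
rate = hit_rate(forecasts, outcomes, 0.65, 0.75)
print(f"Brier score over 10 calls: {overall:.2f}")
print(f"Events called at about 70% happened {rate:.0%} of the time")
```

A forecaster whose 70% calls come true about 70% of the time is well calibrated, even though roughly 3 in 10 of those calls "miss."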