Gender prediction through trivia performance

Posted to Statistics  |  Nathan Yau

Todd Schneider likes trivia, and he plays in an online league called LearnedLeague. Schneider wondered if there was anything interesting he could glean from the performance of the LLamas (LearnedLeague members) that might apply to knowledge in general.

He looked at it from two angles. In the first, he simply calculated correlation coefficients between subjects. If you know world history, are you more likely to know geography? Yes. If you know math, are you more likely to be in tune with pop culture? Probably not. The correlations aren’t too surprising, but the correlation strengths are fun to poke at.
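
If you want to poke at correlations like these yourself, here is a minimal sketch of a pairwise Pearson correlation in plain Python. It is not Schneider's code, and the scores are made-up numbers for five hypothetical players, not LearnedLeague data. The category names and the helpers `pearson` and `correlation_matrix` are placeholders for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(table):
    """Map every (category, category) pair to its Pearson correlation."""
    names = list(table)
    return {(a, b): pearson(table[a], table[b]) for a in names for b in names}


# Made-up share of questions answered correctly, one value per hypothetical player.
scores = {
    "world_history": [0.80, 0.55, 0.65, 0.30, 0.90],
    "geography": [0.75, 0.50, 0.70, 0.35, 0.85],
    "pop_culture": [0.40, 0.60, 0.40, 0.60, 0.50],
}

matrix = correlation_matrix(scores)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: {r:+.2f}")
```

With real data you would have one row per player and one column per category, and the same pairwise calculation applies.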

The second angle: gender prediction through performance levels in various subjects.

LLamas optionally provide a bit of demographic information, including gender, location, and college(s) attended. It’s not lost on me that my category performance is pretty stereotypically “male.” For better or worse, my top 3 categories—business, math, and sports—are often thought of as male-dominated fields. That got me to wondering: does performance across categories predict gender?

As shown up top, Schneider used a decision tree and got decent results. [Thanks, Todd]
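
For a rough idea of what a decision tree does under the hood, here is a minimal sketch of its basic building block: finding the single best split on one feature, scored by Gini impurity. A full tree repeats this step on each resulting group. This is an illustration under my own assumptions, not Schneider's model, and `gini` and `best_split` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all one class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Find the split with the lowest weighted Gini impurity.

    rows: list of dicts mapping feature name -> numeric value.
    labels: one class label per row.
    Returns (feature, threshold, impurity), or None if no feature
    has two distinct values to split between.
    """
    n = len(rows)
    best = None
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds sit halfway between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[2]:
                best = (feature, threshold, impurity)
    return best
```

In this setting each row would hold one player's per-category performance and each label would be the self-reported gender. The returned feature and threshold describe the first branch of a tree.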

