Gender prediction through trivia performance

Posted to Statistics  |  Nathan Yau

Todd Schneider likes trivia, and he plays in an online league called LearnedLeague. Schneider wondered whether the performance of the LLamas (LearnedLeague members) might reveal anything interesting about knowledge in general.

He looked at it from two angles. In the first, he simply calculated correlation coefficients between subjects. If you know world history, are you more likely to know geography? Yes. If you know math, are you more likely to be in tune with pop culture? Probably not. The correlations aren’t too surprising, but the correlation strengths are fun to poke at.
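For a sense of what calculating correlation coefficients between subjects involves, here is a minimal sketch in plain Python. It assumes Pearson correlation on each player's fraction of correct answers per category, which may not be exactly what Schneider computed. The category scores and the names `pearson`, `players`, and `correlation_matrix` are made up for illustration and are not LearnedLeague data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical fraction correct per category, one entry per player (made-up numbers).
players = {
    "world_history": [0.80, 0.55, 0.30, 0.65, 0.45],
    "geography":     [0.75, 0.60, 0.35, 0.70, 0.40],
    "math":          [0.40, 0.70, 0.50, 0.30, 0.60],
}


def correlation_matrix(table):
    """Correlation for every ordered pair of categories in the table."""
    cats = list(table)
    return {(a, b): pearson(table[a], table[b]) for a in cats for b in cats}


matrix = correlation_matrix(players)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: {r:+.2f}")
```

A value near +1 means players who do well in one category tend to do well in the other, and a value near 0 means performance in one says little about the other.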

The second angle: gender prediction through performance levels in various subjects.

LLamas optionally provide a bit of demographic information, including gender, location, and college(s) attended. It’s not lost on me that my category performance is pretty stereotypically “male.” For better or worse, my top 3 categories—business, math, and sports—are often thought of as male-dominated fields. That got me to wondering: does performance across categories predict gender?

As shown up top, Schneider used a decision tree and got decent results. [Thanks, Todd]
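To make the decision tree idea concrete, here is a minimal sketch of the building block: a one-level tree (a stump) that picks the single category and threshold that best separate two groups, scored by Gini impurity. This is not Schneider's model or code. The rows, the neutral labels "a" and "b" standing in for the self-reported gender field, and all function names are hypothetical. A full tree would repeat the same split search on each side.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all one class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def partition(rows, labels, feature, threshold):
    """Split labels by whether the row's feature value is <= threshold."""
    left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
    right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
    return left, right


def best_split(rows, labels):
    """Return (feature, threshold, weighted Gini) for the best single split."""
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left, right = partition(rows, labels, feature, threshold)
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    if best is None:
        raise ValueError("no split possible: every feature is constant")
    return best


def fit_stump(rows, labels):
    """Fit a one-level tree that predicts the majority label on each side."""
    feature, threshold, _ = best_split(rows, labels)
    left, right = partition(rows, labels, feature, threshold)
    return {
        "feature": feature,
        "threshold": threshold,
        "left": Counter(left).most_common(1)[0][0],
        "right": Counter(right).most_common(1)[0][0],
    }


def predict(stump, row):
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


# Made-up toy data: (fraction correct in category 0, fraction correct in category 1).
rows = [(0.8, 0.3), (0.7, 0.4), (0.9, 0.2), (0.3, 0.7), (0.4, 0.8), (0.2, 0.6)]
labels = ["a", "a", "a", "b", "b", "b"]

stump = fit_stump(rows, labels)
print(stump)
print(predict(stump, (0.85, 0.10)))
```

On real data no single split would separate the groups this cleanly, which is why a full tree stacks several such splits and why accuracy should be checked on players the tree has not seen.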
