Fake correlation

Posted to Statistics  |  Nathan Yau

Gabriel Rossman, a sociology professor at UCLA, describes colliders: cases where correlation does not equal causation, and where the correlation might not even exist in the full population. Referring to the simulated plot above, Rossman uses Hollywood actor selection as an example:

For instance, suppose that in a population of aspiring Hollywood actors there is no correlation between acting ability and physical attractiveness. However assume that we generally pay a lot more attention to celebrities than to some kid who is waiting tables while going on auditions. That is, we can not readily observe people who aspire to be actors, but only those who actually are actors. This implies that we need to understand the selection process by which people get cast into films. In the computer simulation displayed below I generated a population of aspiring actors characterized by “body” and “mind,” each of which follows a normal distribution and with these two traits being completely orthogonal to one another. Then imagine that casting directors jointly maximize talent and looks so only the aspiring actors with the highest sum for these two traits actually get work in Hollywood. I have drawn the working actors as triangles and the failed aspirants as hollow circles. Among those actors we can readily observe there then will be a negative correlation between looks and talent, even though there is no such correlation in the grand population. If we see only the working actors without understanding the censorship process we might think that there is some stupefaction of being ridiculously good-looking.
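The selection effect Rossman describes is easy to reproduce. Below is a minimal sketch in Python, not Rossman's own code: the population size, the 10% casting cutoff, the random seed, and the function names `pearson` and `simulate` are all assumptions made for illustration. It draws two independent normal traits, keeps only the aspirants with the highest combined score, and compares the correlation in the full population with the correlation among those who were cast.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=10_000, top_fraction=0.1, seed=0):
    """Return (correlation among all aspirants, correlation among working actors).

    n, top_fraction, and seed are illustrative assumptions, not values
    taken from the original simulation.
    """
    rng = random.Random(seed)

    # "body" and "mind" are drawn independently, so they are uncorrelated
    # in the full population of aspiring actors.
    aspirants = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    # Casting directors keep only those with the highest body + mind sum.
    ranked = sorted(aspirants, key=lambda p: p[0] + p[1], reverse=True)
    working = ranked[: int(n * top_fraction)]

    r_all = pearson([b for b, _ in aspirants], [m for _, m in aspirants])
    r_working = pearson([b for b, _ in working], [m for _, m in working])
    return r_all, r_working


r_all, r_working = simulate()
print(f"all aspirants:  r = {r_all:+.2f}")
print(f"working actors: r = {r_working:+.2f}")
```

Under these assumptions the first correlation should land near zero and the second should be clearly negative, even though nothing in the setup links looks to talent. Making the cutoff stricter or looser changes how strong the spurious negative correlation appears.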
