How Netflix creates movie micro-genres

Posted to Statistics  |  Nathan Yau

Alexis Madrigal and Ian Bogost for The Atlantic reverse-engineered the Netflix genre generator, analyzed the data, and then made their own. Then they talked to Todd Yellin, the guy at Netflix who created the micro-genre system. It’s no accident when you see altgenres like “Visually-striking Goofy Action & Adventure” and “Sentimental set in Europe Dramas from the 1970s” in your browser.

The Netflix Quantum Theory doc spelled out ways of tagging movie endings, the “social acceptability” of lead characters, and dozens of other facets of a movie. Many values are “scalar,” that is to say, they go from 1 to 5. So, every movie gets a romance rating, not just the ones labeled “romantic” in the personalized genres. Every movie’s ending is rated from happy to sad, passing through ambiguous. Every plot is tagged. Lead characters’ jobs are tagged. Movie locations are tagged. Everything. Everyone.

That’s the data at the base of the pyramid. It is the basis for creating all the altgenres that I scraped. Netflix’s engineers took the microtags and created a syntax for the genres, much of which we were able to reproduce in our generator.
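To get a feel for what a "syntax for the genres" could look like, here is a minimal sketch in Python. It is not Netflix's grammar or the Atlantic generator. The `GenreTags` fields, the `altgenre` function, and the slot order are all assumptions, chosen only so that the two altgenre names mentioned above come out right.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GenreTags:
    """Hypothetical bundle of microtags; the field names are illustrative only."""
    genre: str                                          # e.g. "Dramas"
    adjectives: List[str] = field(default_factory=list)  # e.g. ["Sentimental"]
    set_in: Optional[str] = None                        # e.g. "Europe"
    decade: Optional[str] = None                        # e.g. "1970s"


def altgenre(tags: GenreTags) -> str:
    """Join the filled slots in a fixed order (an assumed order, not Netflix's)."""
    parts = list(tags.adjectives)
    if tags.set_in:
        parts.append(f"set in {tags.set_in}")
    parts.append(tags.genre)
    if tags.decade:
        parts.append(f"from the {tags.decade}")
    return " ".join(parts)


print(altgenre(GenreTags("Action & Adventure", ["Visually-striking", "Goofy"])))
print(altgenre(GenreTags("Dramas", ["Sentimental"], set_in="Europe", decade="1970s")))
```

Each slot is optional except the base genre, so the same template covers short names and long ones. The real system presumably has many more slots and rules about which tags can combine.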

Be sure to play around with Bogost’s generator at the top of the Atlantic article. It will amuse.

