Using data to find a girlfriend

Posted to Statistics  |  Nathan Yau

Remember when Amy Webb created a bunch of fake male profiles to scrape data from two dating sites and analyze it to find a husband? Mathematician Chris McKinlay took a similar route to find a girlfriend (and now fiancée). However, unlike Webb, who used a relatively small sample, McKinlay scraped data for thousands of profiles in his area and analyzed it more thoroughly, in search of the perfect mate.

For McKinlay’s plan to work, he’d have to find a pattern in the survey data—a way to roughly group the women according to their similarities. The breakthrough came when he coded up a modified Bell Labs algorithm called K-Modes. First used in 1998 to analyze diseased soybean crops, it takes categorical data and clumps it like the colored wax swimming in a Lava Lamp. With some fine-tuning he could adjust the viscosity of the results, thinning it into a slick or coagulating it into a single, solid glob.

He played with the dial and found a natural resting point where the 20,000 women clumped into seven statistically distinct clusters based on their questions and answers. “I was ecstatic,” he says. “That was the high point of June.”

He selected the two clusters most to his liking, looked at what interested the women, and then adjusted his profile accordingly. He didn’t lie. He just emphasized the traits that he possessed and that women tended to like. Then he waited for women to notice him.
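
If you're curious what the clustering step in the passage above looks like, here is a minimal sketch of the basic K-Modes idea in Python. It is not McKinlay's modified version, which isn't described here. The survey answers are made up, the function names are my own, and starting from the first k distinct rows is a simplification I chose so the result is repeatable. Real runs usually try several starting points, because the final clusters depend on them.

```python
from collections import Counter


def mismatch(a, b):
    """Simple matching dissimilarity: count of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, max_iter=20):
    """Cluster equal-length tuples of categorical values into k groups.

    Illustrative only: initial modes are the first k distinct rows.
    """
    modes = list(dict.fromkeys(rows))[:k]
    if len(modes) < k:
        raise ValueError("need at least k distinct rows")
    labels = None
    for _ in range(max_iter):
        # Assign each row to the mode it disagrees with least.
        new_labels = [
            min(range(k), key=lambda j: mismatch(row, modes[j]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each mode: the most common value in every column.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = tuple(
                    Counter(col).most_common(1)[0][0]
                    for col in zip(*members)
                )
    return labels, modes


# Hypothetical answers to three survey questions, one tuple per person.
answers = [
    ("yes", "no", "often"),
    ("no", "yes", "never"),
    ("yes", "no", "often"),
    ("no", "yes", "never"),
    ("yes", "no", "sometimes"),
    ("no", "yes", "sometimes"),
]

labels, modes = k_modes(answers, k=2)
print(labels)
print(modes)
```

The only knob in this sketch is k, the number of clusters. Where k-means averages numbers, K-Modes counts mismatched answers and uses the most common answer per question as each cluster's center, which is why it suits categorical survey data.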

It’s kind of like he built a targeted advertising system for himself and then cast a really wide net. Even though McKinlay is engaged now, I still wonder if it actually worked or if something similar might have happened if he had left it to chance. I like to believe the latter. He did, after all, go on dates with 87 other people before finding a match.
