Newborn false positives

Posted to Mistaken Data  |  Nathan Yau

Shutterfly sent promotional emails that congratulate new parents and encourage them to send thank-you cards. The problem: a lot of people on that list weren’t new parents.

Several tipsters forwarded us the email that Shutterfly sent out in the wee small hours of this morning. One characterized the email as “data science gone wrong.” Another says that she had actually been pregnant and would have been due this month, but miscarried six months ago. Is it possible that Shutterfly analyzed her search data and just happened to conclude, based on that, that she would be welcoming a child around this time? Or is it, as she hoped via email, “just a horrible coincidence?”

Only Shutterfly knows what actually happened (they insist it was a random mistake), but it sounds like a naive use of data somewhere in the pipeline. Maybe someone remembered the Target story, got excited, and forgot about the repercussions of false positives. Or maybe someone made an incorrect assumption about what certain purchases imply and didn’t test thoroughly enough.
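To see why false positives bite so hard here, a quick back-of-the-envelope sketch helps. Every number below is made up for illustration (the list size, the share of actual new parents, and the classifier's hit and miss rates are all assumptions, not anything Shutterfly has reported), and `flagged_breakdown` is a hypothetical helper, not anyone's real pipeline.

```python
def flagged_breakdown(list_size, base_rate, sensitivity, false_positive_rate):
    """Split the flagged recipients into correct and incorrect hits.

    All inputs are hypothetical:
      list_size            -- number of customers scored
      base_rate            -- share of customers who really are new parents
      sensitivity          -- share of real new parents the model flags
      false_positive_rate  -- share of everyone else the model flags anyway
    """
    actual_positives = list_size * base_rate
    actual_negatives = list_size - actual_positives

    true_positives = actual_positives * sensitivity
    false_positives = actual_negatives * false_positive_rate

    # Precision: of everyone who gets the email, what share are new parents?
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Assumed scenario: 1 million customers, 2% are new parents,
# the model catches 90% of them and wrongly flags 5% of the rest.
tp, fp, precision = flagged_breakdown(1_000_000, 0.02, 0.90, 0.05)
print(f"correctly congratulated:   {tp:,.0f}")
print(f"incorrectly congratulated: {fp:,.0f}")
print(f"share of emails that were right: {precision:.1%}")
```

Under these assumed rates, the wrongly congratulated outnumber the real new parents by more than two to one, even though the model sounds accurate on paper. When the group you are looking for is a small slice of the list, a modest false positive rate still lands on a lot of people.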

In any case, this slide suddenly takes on new meaning.
