How open data saved $3.2 billion

Posted to Statistics  |  Nathan Yau

This is a story of fake charities and tax shelters. An analysis of data from the Canada Revenue Agency (CRA) found that fraudulent organizations collected billions of dollars in donations, with only a tiny portion going to the actual causes. In one case, only $1 out of every $100 went to helping the homeless. The rest of the money went to a tax shelter. Shameful.

All told, my colleague estimated that these illegally operating charities alone sheltered roughly half a billion dollars in 2005. Indeed, newspapers later confirmed that in 2007, fraudulent donations were closer to a billion dollars a year, with some 3.2 billion dollars illegally sheltered, a sum that accounts for 12% of all charitable giving in Canada.

This exposed not only the fraud, but also negligence on the part of the CRA charity division (now under new leadership). How did this go on for so long? A simple sort on the data would have raised questions immediately. Instead, it took a freelance consultant, poking around out of curiosity, and journalists, who were aware of fishy behavior, to move things along.
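To give a sense of how little analysis "a simple sort" involves, here is a rough sketch in Python. It is not the consultant's actual method. The record layout (`name`, `receipts`, `program_spending`), the thresholds, and every figure in it are made up for illustration and are not taken from the CRA data.

```python
from dataclasses import dataclass


@dataclass
class CharityFiling:
    """Hypothetical record layout; the real CRA fields may differ."""
    name: str
    receipts: float          # total receipted donations reported
    program_spending: float  # amount spent on the stated cause


def rank_by_receipts(filings):
    """Sort filings from largest to smallest receipted donations."""
    return sorted(filings, key=lambda f: f.receipts, reverse=True)


def program_share(filing):
    """Fraction of receipts that reached the stated cause."""
    if filing.receipts == 0:
        return 0.0
    return filing.program_spending / filing.receipts


def flag_suspicious(filings, min_receipts=1_000_000, max_share=0.05):
    """Keep large filings that pass along very little; thresholds are arbitrary."""
    return [
        f for f in rank_by_receipts(filings)
        if f.receipts >= min_receipts and program_share(f) <= max_share
    ]


# Invented example records, not real organizations or amounts.
filings = [
    CharityFiling("Example Shelter Fund", 50_000_000, 500_000),
    CharityFiling("Example Food Bank", 2_000_000, 1_700_000),
    CharityFiling("Example Relief Trust", 120_000_000, 3_000_000),
    CharityFiling("Example Youth Club", 80_000, 60_000),
]

flagged = flag_suspicious(filings)
for f in flagged:
    print(f"{f.name}: receipts ${f.receipts:,.0f}, "
          f"{program_share(f):.1%} to the cause")
```

In this toy data, the first invented record passes along $1 of every $100, the same ratio as the case described above. Sorting by receipts and glancing at the share column is enough to surface it.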

[via @datamarket]


  • it’s an interesting story but I come to a different conclusion. opening tax data to the public has very severe privacy implications, which in my book considerably outweigh the potential benefits of possibly finding a problem. this is also unleashing an army of self-righteous data vigilantes who could attack individuals or organizations with complex, but legal (and moral) tax strategies.
    however, it’s inexcusable that CRA wasn’t able to detect fraud of that caliber by themselves. a fraud which was defeated not by exotic statistical analysis, but by just laying down 20 numbers in an excel spreadsheet. How come CRA didn’t even try to find where their missing billion dollars went?

  • yeah, i don’t think individuals’ tax receipts should be released, but maybe stuff that’s the equivalent of FEC data should be transparent.

    but like you said, i thought the worst part was the CRA’s non-action. it seems like it wasn’t even a matter of advanced analysis or anything – just some simple sorting and some skepticism as to why these organizations no one has heard of are bringing in millions.

