How open data saved $3.2 billion

Posted to Statistics  |  Nathan Yau

This is a story of fake charities and tax shelters. An analysis of data from the Canada Revenue Agency (CRA) found that fraudulent organizations collected billions of dollars in donations, with only a tiny portion going to the actual causes. In one case, only $1 out of every $100 went to helping the homeless. The rest of the money went to a tax shelter. Shameful.

All told, my colleague estimated that these illegally operating charities alone sheltered roughly half a billion dollars in 2005. Indeed, newspapers later confirmed that in 2007, fraudulent donations were closer to a billion dollars a year, with some 3.2 billion dollars illegally sheltered, a sum that accounts for 12% of all charitable giving in Canada.

This exposed not only the fraud but also negligence on the part of the CRA charity division (now under new leadership). How did this go on for so long? A simple sort on the data would have raised questions immediately. Instead, it took a freelance consultant, poking around out of curiosity, and journalists, who were aware of fishy behavior, to move things along.
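
To make the "simple sort" concrete, here is a minimal sketch in Python. The charity names, dollar figures, and thresholds are all invented for illustration. They are not CRA data, and this is not the consultant's actual method, just one way such a check might look: rank organizations by donations received, then flag the big ones that pass almost nothing on to their programs.

```python
# Hypothetical records: (name, donations_received, spent_on_programs), in dollars.
# These figures are invented for illustration; they are not CRA data.
filings = [
    ("Neighbourhood Food Bank", 250_000, 200_000),
    ("Unknown Relief Fund", 40_000_000, 400_000),
    ("City Shelter Society", 1_200_000, 900_000),
    ("Obscure Aid Trust", 75_000_000, 1_500_000),
]


def rank_by_receipts(rows):
    """Sort filings by donations received, largest first, and attach the
    share of donations that was spent on charitable programs."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    return [
        (name, received, spent / received)
        for name, received, spent in ranked
        if received > 0  # skip empty filings to avoid dividing by zero
    ]


def flag_suspicious(rows, min_received=10_000_000, max_program_share=0.05):
    """Return names of organizations that took in at least `min_received`
    but spent no more than `max_program_share` of it on programs.
    Both thresholds are arbitrary assumptions, not official criteria."""
    return [
        name
        for name, received, share in rank_by_receipts(rows)
        if received >= min_received and share <= max_program_share
    ]


if __name__ == "__main__":
    for name, received, share in rank_by_receipts(filings):
        print(f"{name:<26} ${received:>12,}  {share:.1%} to programs")
    print("Worth a closer look:", flag_suspicious(filings))
```

Nothing here is advanced analysis. A flagged name is a prompt for a question, not proof of fraud, which is roughly the level of skepticism the story suggests was missing.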

[via @datamarket]


  • it’s an interesting story but I come to a different conclusion. opening tax data to the public has very severe privacy implications, which in my book considerably outweigh the potential benefits of possibly finding a problem. this is also unleashing an army of self-righteous data vigilantes who could attack individuals or organizations with complex, but legal (and moral) tax strategies.
    however, it’s inexcusable that CRA wasn’t able to detect fraud of that caliber by themselves. a fraud which was defeated not by exotic statistical analysis, but by just laying down 20 numbers in an excel spreadsheet. How come CRA didn’t even try to find where their missing billion dollars went?

  • yeah, i don’t think individuals’ tax receipts should be released, but maybe stuff that’s the equivalent of FEC data should be transparent.

    but like you said, i thought the worst part was the CRA’s non-action. it seems like it wasn’t even a matter of advanced analysis or anything – just some simple sorting and some skepticism as to why these organizations no one has heard of are bringing in millions.

