# Open thread: Can you spot the wrongness in this tax graph?

May 17, 2011

### Topic

Mistaken Data

The argument behind this graph in The Wall Street Journal is that the middle class has most of the money, and it ties into a larger argument about who should be taxed what. There is, after all, a spike in the middle. Is that really the case, though? Sound off in the comments.

(Cheat sheet: Jonathan Chait explains what’s going on and Kevin Drum improves the graph to show more truth, although his graph can be improved, too. Grab the data here [Excel spreadsheet] from the IRS, and give it a go.)

• It’s an easy one :)
The dimension range is not constant on x axis, so representation of the actual data is totally wrong (1-5K and 200-500K are represented equally).

• It is worth noting, though, that most of the intervals come from the IRS data, so there might be a good reason for them (or not).

• Can you spot the wrongness in this one: http://tinyurl.com/6967qd3

• hmm. I guess that the “quintiles” are not equal-sized data subsets?
I’m from Croatia and I don’t really know what the second and middle quintiles are :)

• Eric

All of those using this graph (or their “improvements”) to prove their points fail to correct the error of the x-axis. None of the bins on the x-axis are equivalent, although they are drawn as if they were. It’s a discrete versus continuous data problem.

• The structure of the graph implies that the total amount of tax paid by the middle class (across all people) should be less than the total amount of tax paid by the super rich (across all people), regardless of how many people are in each group. However, if the middle class has a million times as many people in it as the upper class, then the middle class’ total income tax need not be less than the upper class’ total. The premise of the graph was flawed before any data ever entered the graph.

• Yeah, the x-axis on the graph isn’t the best of choices, but their point that middle-income earners pay a large share is basically correct. Those in the \$100-\$200K range have a total taxable income of around \$1.3B while everyone above that totals just over \$2B.

• This is a repeat of the graph lie. Median income is \$44k, nowhere near \$150k. Dicing it up into quintiles might be better, or some other uniform interval on the x-axis. But you are correct that it shows where tax payments are concentrated; it’s just not the middle class, except for readers in New York or the likely income range of subscribers to the Wall Street Journal.

• I would like to defend this chart.
Although I agree that the categories do not have the same size, both in terms of number of taxpayers and range of the income bracket, this construction allowed the author to make a point, which is to highlight that the people earning \$100k-200k represent a “substantial” total.

The goal of the author wasn’t to show a neutral analysis of the income distribution in the USA, but instead to support an article (in the “opinion” section, no less) which attempts to debunk the myth that taxing the richest is enough to get out of deficit.

• Mike N

The chart gives no indication of what the total taxable income should be or balance to. Visually it looks like it’s somewhere in the 5-6 trillion range, which given a GDP of ~13 trillion is an odd total.

• Dave

Subdividing the millionaires is the deceptive part. If all the millionaires are in one bin, they have far more money than the middle class. Binning the millionaires divides up their income, creating the visual appearance that their total income is less than the middle class.
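The effect of splitting a group across several bins can be checked mechanically by summing the sub-bins back into one. What follows is only a sketch: the totals are made-up numbers (not the IRS figures), and the bin labels and the `merge_bins` helper are hypothetical names chosen for illustration. Whether the merged bar really overtakes the middle bar depends on the actual data.

```python
# Illustrative totals in billions of dollars. These are made-up numbers,
# NOT the IRS figures, and the bin labels are hypothetical.
totals = {
    "100-200K": 1000,
    "1-2M": 400,
    "2-5M": 350,
    "5-10M": 300,
    "10M+": 450,
}

def merge_bins(totals, labels, merged_label):
    """Return a copy of `totals` with the bins in `labels` summed into one bin."""
    merged = {k: v for k, v in totals.items() if k not in labels}
    merged[merged_label] = sum(totals[k] for k in labels)
    return merged

millionaire_bins = ["1-2M", "2-5M", "5-10M", "10M+"]
merged = merge_bins(totals, millionaire_bins, "1M+")

# Tallest bar while the group is split, versus the single merged bar.
tallest_split_bar = max(totals[k] for k in millionaire_bins)
print(tallest_split_bar, merged["1M+"], merged["100-200K"])
```

With these invented inputs, no split bar is taller than 450, while the merged bar is 1500 against 1000 for the middle bin. The same data gives a different visual impression depending only on where the bin edges are drawn.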

What do I win?

• schubert malbas

i agree, the middle income tier, which includes probably 20-40% of the population, will always have a grand total that is far larger than the higher income tier, which will only be 1-5% of the population. i tried to calculate in my head using the numbers from the graph above, & I can’t figure everything out. so i will leave the accounting & auditing to the clever ones, and will simply suggest to replace the y-axis with AVERAGE ADJUSTED TAXABLE INCOME (i.e., total taxable income per tier / the number of people in that tier). This will be a better graph to rationalize that a tier does indeed have the most money.

• schubert malbas

of course, that last suggestion i made in jest… the graph makes a valid point for the WSJ article, but it deceives and hides the fact that each one in the \$1 M ++ tier makes at least a million dollars.

• Josh

Another thing to point out, this is all based on tax return filings, so there could be missing data.

But regardless of the x-axis scaling, plotting percentages from the spreadsheet tells the same story: the 100k-200k bracket is assessed 22.5% of the total income tax, the combined income tax of all filings under 200k makes up 48% of the total income tax, and the combined income tax of the filings over 200k makes up 52.1%.

• Bill

Without understanding the argument being made, it is impossible to evaluate the usefulness of the chart. If, for example, the argument was that the ‘millionaire’ taxes that are in vogue won’t generate much revenue, then it’s a much better chart. Given that argument, I find Kevin Drum’s respray for Mother Jones to be far more misleading than the WSJ chart. No one is talking about raising taxes on the people that have most of the income, 100-200K. Shame on you for taking a technical question and spinning it in a political direction.

• My suggestion would be to adjust the width of each bar. So the amount of taxes would not be represented by the height, but by the area.

So something like this:
http://michael-kreil.de/temp/taxgraph.png
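One way to get bars whose area, rather than height, carries the amount is to divide each bracket's total by the bracket's width. This is a minimal sketch of that idea and not necessarily how the linked image was produced; the function name `bar_heights` and the totals are assumptions for illustration, while the two bracket ranges are the ones mentioned earlier in the thread.

```python
def bar_heights(brackets):
    """Given (low, high, total) tuples, return heights such that
    height * (high - low) equals each bracket's total."""
    return [total / (high - low) for low, high, total in brackets]

# Bracket edges in thousands of dollars (the 1-5K and 200-500K brackets
# mentioned earlier in the thread); the totals are made up for illustration.
brackets = [(1, 5, 40.0), (200, 500, 600.0)]
heights = bar_heights(brackets)

# Multiplying back by the width recovers the totals, so area encodes the amount.
areas = [h * (high - low) for h, (low, high, _) in zip(heights, brackets)]
print(heights, areas)
```

The narrow bracket ends up with the taller bar even though its total is smaller, because its money is packed into a much shorter income range. If matplotlib is available, passing the lower edges, these heights, and the widths to `plt.bar(..., align="edge")` should draw such a chart.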

• Area is generally very difficult for human beings in terms of comparison. So I wouldn’t recommend it, Michael.

• It’s less about comparing value, but the visual impression of who pays the most taxes. I think that this kind of chart gives the most accurate impression.

• You are right. How did you do it?

• orthodoc

Chait and Drum are being deliberately misleading.

First, here’s the share of income by quintile.

Bottom 20% of households: 3.4%
Next 20%: 8.6%
Middle 20%: 14.5%
Next 20%: 22.9%
Top 20%: 50.5%

For interest: top 5% earns 22%

Keep in mind that richer households are larger — an average of 3.1 people in the top fifth, compared with 2.5 people in the middle fifth and 1.7 in the bottom fifth. So the smaller household almost by definition will make less – because fewer people are working. They can’t have two incomes if there are only 1.7 people….

Now let’s look at percentage of taxes paid:
Bottom 20%: 0.8% (all federal taxes) and -2.8% (income tax only; they got \$ back)
Next 20%: 4.1% and -0.8% (they also got \$ back)
Middle 20%: 9.1% and 4.4%
Next 20%: 16.5% and 12.9%
Top 20%: 69.3% and 86.3%

This info is from the Census Bureau, the IRS, and the CBO. You are welcome to look it up.

No matter how you slice it, the US tax code is steeply progressive, taxes any dollar that a two-income household earns more heavily than a one-income household, penalizes extra effort, and gives back far more to low income households than it takes.

You are welcome to argue about whether that’s good or bad, but let’s put to rest the tired canard that “the reason the top [fill in the blank] should get taxed is because they have all the money.”
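The two sets of shares in this comment can be compared directly by dividing each quintile's share of all federal taxes by its share of income. A ratio below 1 means the quintile pays a smaller share of taxes than it earns of income, and above 1 the reverse. This is only a sketch using the commenter's figures as given (they have not been checked against the cited sources), and the variable names are hypothetical.

```python
# Shares in percent, bottom quintile first, copied from the comment above.
income_share = [3.4, 8.6, 14.5, 22.9, 50.5]
federal_tax_share = [0.8, 4.1, 9.1, 16.5, 69.3]

# Ratio of tax share to income share for each quintile.
ratios = [round(tax / income, 2)
          for tax, income in zip(federal_tax_share, income_share)]
print(ratios)  # [0.24, 0.48, 0.63, 0.72, 1.37]
```

On these numbers the ratio rises with every quintile, which is what a progressive system looks like. It says nothing by itself about whether that degree of progressivity is good or bad.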

• I plotted the raw data to show the following:
for each income bracket: total income earned, total taxes paid, average tax rate for that bracket. I don’t know what more people think we should take from this.
http://dl.dropbox.com/u/2378180/taxes.png

• Above \$5 million in annual income, the tax rate drops.

So the only difficulty is to earn the first 5 million. Right?

• John Macintyre

1. X axis scaling aside, the primary graph as shown is technically correct, but may be misleading, because it is based on adjusted gross income. This means, after deductions have been applied. :-)
2. Kevin Drum’s referenced graph, showing total taxable income amounts as a function of total income, presents a very different picture. It suggests there are a lot of exemptions the very wealthy can apply that are not used by other income groups.
3. It would also be interesting to plot the total taxable income amounts as a function of, say, thousands of people. With that we could see income distribution as a function of numbers of people. And for the record, such plots are not liberal or conservative; they are just factual information. And I always liked John Adams’s view of facts. “Facts are stubborn things; and whatever may be our wishes, our inclinations, or the dictates of our passions, they cannot alter the state of facts and evidence.” JMac

• Dave Marcus

Another very serious problem that underlies this chart, and equally Kevin Drum’s redo, is that it shows where the money is: where the government can go fishing to catch the most fish. It does not show the impact on taxpayers in each bracket. The graph is being used to say that particular tax changes are bad for the middle class. It doesn’t prove or disprove that. The graph posted by r in its comment on May 17, 2011 at 4:44 pm moves in that direction.

• The total taxable income number is an artificial number that is reported to the IRS and created by applying the tax code (or even by fraud). It does not represent how much the very rich have, just what the current tax system defines as taxable, based on what is reported to it. I would assume the difference between actual and reported income and assets will be far larger at the top end of the income spectrum, as there are more loopholes for that bracket and far more for a business owner, which many of those would most likely be.

• Some of my comments would apply to the middle class as well, of course.

• People that make 50-500k have a regular job and significant taxable income. However, they don’t necessarily have the resources to hire accountants and set up elaborate tax schemes to cheat the government. Therefore, this group in general pays what it is supposed to. As one’s annual income starts to approach the 1 million plus category, it becomes possible to hire accountants to fix your taxation problems. The rapid fall-off in tax revenue from the million plus groups is due to the fact that there are really fewer people in this bracket, and the people making this kind of money have the resources to make it look like they earned less.

• A bit late to the party, but sometimes you need to sleep on these nagging issues for a few days…

There are two problems here: context, and using the right tool for the job. The context is the question being asked in this debate. Really, the question that the article tries to ask from the visualization is “what percentage of the total wealth comes from the middle class?” A distribution graph is useful for seeing where your data lies, but it’s not really good at answering percentage questions. To answer percentage questions, you need to turn your distribution histogram into a cumulative distribution function (that’s why the IRS spreadsheet breaks the data down into the “accumulated” views).

Take a look at this quick dual chart I made from the data: http://i.imgur.com/jwwmU.png

At the top is a CDF of accumulated gross income vs. % of total returns that constitute that accumulated gross income. At the bottom is a chart that maps % of total returns into the incomes that constitute that %.

The question that I’m trying to answer with the chart is “what % of the total wealth is supplied by the ‘middle’ class?” Now, I’ve defined the middle class as returns between \$100-200k, but the beauty of using a CDF is that the chart doesn’t hide anything… you can tweak your definition, follow the same methodology, and arrive at an answer appropriate for your context. No deceptive grouping of buckets.
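Turning per-bin totals into a cumulative distribution, and then reading off the share held by any range of bins, takes only a few lines. This is a rough sketch of the methodology rather than the code behind the chart above; the helper names `cumulative_shares` and `share_between` are hypothetical, and the per-bin incomes are invented for illustration, not taken from the IRS spreadsheet.

```python
from itertools import accumulate

def cumulative_shares(values):
    """Running total of `values`, expressed as a fraction of the grand total."""
    total = sum(values)
    return [running / total for running in accumulate(values)]

def share_between(cdf, lo_idx, hi_idx):
    """Fraction of the total contributed by bins lo_idx..hi_idx (inclusive)."""
    below = cdf[lo_idx - 1] if lo_idx > 0 else 0.0
    return cdf[hi_idx] - below

# Made-up income per bin, ordered from lowest bracket to highest.
income_per_bin = [10, 20, 40, 20, 10]
cdf = cumulative_shares(income_per_bin)

# Define the "middle class" as bin 2 only, or widen it to bins 1 through 3.
narrow = share_between(cdf, 2, 2)
wide = share_between(cdf, 1, 3)
print(round(narrow, 3), round(wide, 3))
```

Changing the definition of the middle class only changes the two indices passed to `share_between`; the cumulative curve stays the same, which is the property described above.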

Now, granted, the concept of a CDF is probably beyond the scope of the average Fox News viewer, but you can easily take the ideas here, move a few things around to make them more aesthetically pleasing, add a few sliders and dynamic highlighting, and you might have a visualization that people can understand themselves and play around with to make their own judgements.

• Bin-size aside, this graph’s most fundamental flaw (read “lie”) is that the information presented is irrelevant to the argument being made. The bar-height is in person-dollars, which says nothing about the ability of an individual to shoulder further tax-burden. (If I told you it took 500 man-hours to complete the project, do you know how hard I worked?) I’ll use a simple and absurd case to illustrate. Suppose that we have a million people making a dollar per year and one person making a thousand dollars per year. Chart the data with two bins, and the dollar-a-year bar is *one thousand times taller* than the higher-income bar. Look at all that wealth held by the lower class! Let’s tax it!
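The arithmetic of that absurd case is easy to spell out. The sketch below uses exactly the numbers from the comment; the dictionary layout and variable names are just one hypothetical way to write it down.

```python
# (number of people, income per person in dollars per year), from the example above.
groups = {
    "lower": (1_000_000, 1),
    "higher": (1, 1_000),
}

# Bar height in the chart: total "person-dollars" per group.
totals = {name: people * income for name, (people, income) in groups.items()}

# What an individual in each group actually has available.
per_person = {name: income for name, (_, income) in groups.items()}

height_ratio = totals["lower"] / totals["higher"]
income_ratio = per_person["higher"] / per_person["lower"]
print(totals, height_ratio, income_ratio)
```

The lower group's bar is 1000 times taller, yet each person in the higher group earns 1000 times more. The bar height and the individual's ability to pay point in opposite directions.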