The box-and-whisker plot is an exploratory graphic, created by John W. Tukey, that shows the distribution of a dataset at a glance. If you have the kind of data you might put in a histogram, the box-and-whisker plot (or box plot, for short) could probably be useful too.
The box plot, although very useful, seems to get lost in areas outside of statistics, and I’m not sure why. It could be that people don’t know about it, or that they aren’t sure how to interpret it. In any case, here’s how you read a box plot.
Reading a Box-and-Whisker Plot
Let’s say we ask 2,852 people (and they miraculously all respond) how many hamburgers they’ve consumed in the past week. We’ll sort those responses from least to greatest and then graph them with our box-and-whisker.
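Take the top 50% of the group (1,426 people), the ones who ate more hamburgers; they are represented by everything above the median (the white line). The top 25% of hamburger eaters (713 people) are shown by the top “whisker” and the dots above it. Dots represent outliers: people who ate far more or far fewer hamburgers than is typical. If more than one outlier ate the same number of hamburgers, the dots are placed side by side. The bottom half reads the same way in reverse: the part of the box below the median and the bottom whisker (plus any dots below it) each cover another quarter of the respondents.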
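To make the pieces concrete, here is a minimal sketch in Python of the numbers a box plot draws. It rests on a few assumptions that are not from the survey above: the burger counts are made up, the function name `box_plot_summary` is hypothetical, and outliers are flagged with the common 1.5 × IQR rule, which is one convention among several.

```python
import statistics

def box_plot_summary(values):
    """Return the quartiles, whisker ends, and outliers for a list of numbers.

    Assumes at least two values and the 1.5 * IQR convention for outliers.
    """
    data = sorted(values)
    # Three cut points that split the sorted data into four equal groups.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # height of the box
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme values that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    # Anything beyond the fences is drawn as a dot.
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Made-up weekly hamburger counts for eleven people (illustrative only).
burgers = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 14]
summary = box_plot_summary(burgers)
print(summary)
```

In this made-up sample the box runs from 1.5 to 3.5 with the median at 2, the whiskers reach 0 and 5, and the person who ate 14 burgers shows up as a single dot.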
Find Skews in the Data
The box-and-whisker plot of course shows you more than just four split groups. You can also see which way the data skews. For example, if most people eat only a few burgers while a handful eat a lot, the top whisker will stretch longer than the bottom one, and the median will tend to sit closer to the bottom of the box. Basically, it gives you a good overview of the data’s distribution.
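One rough way to check the direction of a skew without drawing anything is to compare the two halves of the box. The sketch below is only an illustration: the function name `describe_skew` and the sample numbers are made up, and comparing box halves is a quick heuristic, not a formal skewness test.

```python
import statistics

def describe_skew(values):
    """Compare the upper and lower halves of the box to guess the skew direction."""
    q1, median, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    upper_half = q3 - median  # box height above the median line
    lower_half = median - q1  # box height below the median line
    if upper_half > lower_half:
        return "tail toward high values"
    if upper_half < lower_half:
        return "tail toward low values"
    return "roughly symmetric"

# Made-up weekly hamburger counts (illustrative only).
print(describe_skew([0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 14]))
print(describe_skew([1, 2, 3, 4, 5]))
```

The same comparison can be made with the whisker lengths instead of the box halves; on a real plot you would usually look at both.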
That’s all there is to it, so the next time you’re thinking of making a bar graph or a histogram, think about using Tukey’s beloved box-and-whisker plot too.