How I Made That: Interactive Beeswarm Chart to Compare Distributions

The histogram is my favorite chart type, but it’s unintuitive for many. So I’ve been using the less accurate but less abstract beeswarm.

The How I Made That series describes the process behind a graphic and includes code and data to work with.

Distributions are far more interesting than means and medians, and statisticians most often use histograms to see the former. For me, a statistician, reading histograms is straightforward and intuitive, but for those who don’t look at distributions on a regular basis, reading histograms can be a challenge.

Histograms require that readers have a certain level of statistical knowledge. Readers need a mental image of how the spread works and how individual items or data points are abstracted into bars.

To solve this problem, I’ve tried interaction and annotation with histograms on FlowingData a handful of times. It only seems to kind of work. In contrast, the beeswarm chart seems to gain traction more easily. My hunch is that people can relate better to individual shapes moving across a distribution range than they can to bars moving up and down. Just a hunch though.
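
To make the contrast concrete, here is a minimal Python sketch of the two abstractions applied to the same made-up values. It is not the code behind the chart in this tutorial. The function names, the sample data, and the simple greedy placement rule are all assumptions chosen for illustration. The histogram collapses points into counts per bin, while the beeswarm keeps one circle per point and only nudges it vertically so circles don't overlap.

```python
import math

def histogram_counts(values, lo, hi, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so that v == hi lands in the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

def candidate_offsets(radius):
    """Yield vertical offsets 0, +r, -r, +2r, -2r, ... moving away from the axis."""
    yield 0.0
    k = 1
    while True:
        yield k * radius
        yield -k * radius
        k += 1

def beeswarm_positions(values, radius):
    """Greedy layout (an illustrative assumption): x is the value itself,
    y is the offset closest to the axis that avoids overlapping earlier circles."""
    placed = []
    for x in sorted(values):
        for y in candidate_offsets(radius):
            if all(math.hypot(x - px, y - py) >= 2 * radius for px, py in placed):
                placed.append((x, y))
                break
    return placed

# Hypothetical sample data, not from the tutorial.
values = [1.0, 1.2, 1.3, 2.0, 2.1, 2.2, 2.4, 3.5, 4.8]

counts = histogram_counts(values, 0, 5, 5)
print(counts)  # [0, 3, 4, 1, 1]

dots = beeswarm_positions(values, radius=0.5)
for x, y in dots:
    print(f"x={x:.1f}  y={y:+.1f}")
```

The histogram output is five numbers, and the reader has to translate bar heights back into "this many things". The beeswarm output still has nine entries, one per data point, each sitting at its actual value on the x-axis. The price is the accuracy trade-off mentioned above: the vertical position carries no data, and crowded regions get harder to read exactly as the pile of dots grows.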

About the Author

Nathan Yau is a statistician who works primarily with visualization. He earned his PhD in statistics from UCLA, is the author of two best-selling books — Data Points and Visualize This — and runs FlowingData. Introvert. Likes food. Likes beer.