Parallel Sets for categorical data, D3 port

A while back, Robert Kosara and Caroline Ziemkiewicz shared their work on Parallel Sets, a way to visually explore categorical data. Software developer, Jason Davies, just ported the technique to Data-Driven Documents (D3). The interactions for sorting and rearranging are similar to the Kosara and Ziemkiewicz version, but the D3 version of course runs in the browser and has some nifty transitions. Try toggling the show curves box and the icicle plot one.


  • Maybe it is just me, but all those intersections create noise and make it hard to understand. Also, it feels more like a flow, rather than a distribution, which is a little unnatural. I like icicle plot more (treemap – even more?): it is clearer and easier to interpret. Credits to the developer, though, great work in giving concept a try!

  • Matthieu SOLA May 3, 2012 at 8:19 am

    I am new to data visualization (mechanical engineer who sometimes has to play with a lot of data) but I have a hard time reading this and understanding what I can get from it. I don’t feel that I can learn anything quickly except that there are a lot of males who perished in the Titanic and that they were mostly from the crew or third class.

    • Matthieu, Thank Goodness you had the honesty to say what you thought! Parallel Sets are difficult to understand. They are an awkard & inconvenient way to present the data. Really this is reference data and should be in a simple reference table. If they had just left it in a table you’d understand it all within seconds. Data Vis is an emerging field and the quality of visualisations varies. Sadly it seems that many people interested in visualisations allow their critical faculties to fly away when confronted by incomprehensible junk.

  • I have previously seen parallel sets called parallel coordinate plots. I often find them quite useful, though I think the example here is not so good. For me, I guess the main difference is that I use them to visualize continuous data rather than categorical data. If I highlight a range of values in the response variable I can see how the range is connected to the inputs and very quickly visualize correlations and other relationships that numerical statistics just don’t make clear. For the categorical data presented here, though, I think a cross-tabulation would be the better option. Just because you can plot something doesn’t always mean you should….

  • Does displaying the data this way actually enhance my understanding? I can’t seem to figure out what is really going on. The diagram wants my eyes to flow from sex to age to class for no apparent reason, and the various criss-crossing prevents me from understanding anything.

  • IMHO mosaic plots are far more effective for what we are trying to do: visualizing categorical data.



Graphical perception – learn the fundamentals first

Before you dive into the advanced stuff – like just about everything in your life – you have to learn the fundamentals before you know when you can break the rules.

Most popular porn searches, by state

We’ve seen that we can learn from what people search for, through the eyes of Google suggestions: state stereotypes, national …

A Day in the Life of Americans

I wanted to see how daily patterns emerge at the individual level and how a person’s entire day plays out. So I simulated 1,000 of them.

Divorce Rates for Different Groups

We know when people usually get married. We know who never marries. Finally, it’s time to look at the other side: divorce and remarriage.