An Easy Way to Make a Treemap

If your data is a hierarchy, a treemap is a good way to show all the values at once and keep the structure in the visual. This is a quick way to make a treemap in R.

Back in 1990, Ben Shneiderman, of the University of Maryland, wanted to visualize what was going on in his always-full hard drive. He wanted to know what was taking up so much space. Given the hierarchical structure of directories and files, he first tried a tree diagram. It got too big too fast to be useful though. Too many nodes. Too many branches.

The treemap was his solution. It’s an area-based visualization where the size of each rectangle represents a metric. It has since been made popular by Martin Wattenberg’s Map of the Market and Marcos Weskamp’s newsmap.

Here’s a really easy way to make your own treemap in just a couple lines of code. We’re looking to make something like the above.

Step 0. Download R

Like before, we’re going to use R, so you’ll want to get it before going any further. Download it for Windows, Mac, or Linux. Don’t let the outdated site fool you. You can get a lot done with the free software.

Step 1. Load the Data

We’ll use data covering a hundred popular posts on FlowingData. Here it is in CSV format. You don’t have to download it though. We’ll just load it directly into R. The main thing to take note of is what is there. There’s post id, number of views, number of comments, and category.

Okay, let’s load it into R using read.csv():

data <- read.csv("http://datasets.flowingdata.com/post-data.txt")

Loading data in CSV format into R.

Easy enough. We just used the read.csv() function to load data from a URL. If your data is on your computer, you could also do something like data <- read.csv("post-data.txt"). Just make sure the data file is in your current working directory, which you can change via the “Miscellaneous” menu.
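After loading, it can help to check that the columns came through as expected. Here's a minimal sketch that uses a few made-up rows in place of the real file. The column names (id, views, comments, category) are taken from the map.market() call later in this tutorial; the values and the temporary file name are invented for illustration.

```r
# Made-up rows standing in for the real post data (values are invented).
sample_csv <- c(
  "id,views,comments,category",
  "1,12000,35,Visualization",
  "2,8500,12,Statistics",
  "3,4300,8,Visualization"
)

# Write to a temporary file, then read it the same way as a local CSV.
csv_path <- file.path(tempdir(), "post-data-sample.txt")
writeLines(sample_csv, csv_path)
posts <- read.csv(csv_path)

getwd()      # the current working directory, where relative paths are resolved
str(posts)   # column names and types
head(posts)  # first few rows
```

If you'd rather stay in the console, setwd() changes the working directory without going through a menu.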

Step 2. Load the Portfolio package

Only a few more lines of code, and you’ve got a treemap. It’s so easy because we’re going to use the portfolio package in R. First, you have to install it. You can either install the package via the “Package Installer” or you can do it through the command line. Let’s do the latter. Type this in the console to install portfolio:

install.packages("portfolio")

Once installed, load it into R:

library(portfolio)
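If you rerun a script often, you may not want to reinstall the package every time. This is one possible sketch of an install-only-if-missing check. ensure_package is a hypothetical helper name made up for this example, and it assumes the package is available from your CRAN mirror.

```r
# Returns TRUE if the package can be loaded, installing it first when missing.
# ensure_package is a hypothetical helper name, not part of R or portfolio.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Usage (commented out so nothing is installed by accident):
# if (ensure_package("portfolio")) library(portfolio)
```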

Step 3. Make the Treemap

It’s time to make the treemap with map.market(). Type this in the console:

map.market(id=data$id, area=data$views, group=data$category, color=data$comments, main="FlowingData Map")

Tada. You should get something like this:

The default treemap uses a red-green color scale.
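In a treemap, each group's rectangle contains the rectangles of its members, so a group's size follows the sum of its members' areas. To preview which categories will dominate before plotting, you could total views by category. This sketch uses made-up rows with the same column names as the post data; the numbers are invented.

```r
# Made-up rows standing in for the real post data (values are invented).
posts <- data.frame(
  id = 1:5,
  views = c(12000, 8500, 4300, 2200, 9100),
  comments = c(35, 12, 8, 3, 20),
  category = c("Visualization", "Statistics", "Visualization",
               "Design", "Statistics")
)

# Total views per category, largest first.
totals <- aggregate(views ~ category, data = posts, FUN = sum)
totals <- totals[order(-totals$views), ]

# Each category's share of all views, which tracks its share of the map's area.
totals$share <- round(totals$views / sum(totals$views), 2)
print(totals)
```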

To sum up, we did this with four lines of code:

data <- read.csv("http://datasets.flowingdata.com/post-data.txt")
install.packages("portfolio")
library(portfolio)
map.market(id=data$id, area=data$views, group=data$category, color=data$comments, main="FlowingData Map")

Step 4. Customize

Now maybe you want to modify something like color. The cool thing about R is that you can see the code for all the functions, edit it, and then use your customized version. If the green and red scheme isn’t for you or you don’t care about the positive/negative cutoff, then you can change the code to do that. I won’t go into detail, but if you type map.market in the console, you’ll see the function. You can change color or cutoff around lines 36-46.
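When a function is long, numbered lines make it easier to find the part you want to change. Here's a small sketch of that idea. numbered_source is a hypothetical helper made up for this example, and it is shown on read.csv so it runs without extra packages. Note that deparse() reformats the code, so the numbers it produces may not match the line numbers you see when you print a function directly.

```r
# Return a function's source as text, one numbered line per element.
# numbered_source is a hypothetical helper name, not part of base R.
numbered_source <- function(fn) {
  src <- deparse(fn)
  paste0(formatC(seq_along(src), width = 3), ": ", src)
}

# Shown on a base R function; with portfolio loaded you could pass map.market.
writeLines(numbered_source(read.csv))
```

To edit your own copy, assign the function to a new name first, then change the copy rather than the original.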

For example, you can do a black and white color scheme:

You don’t have to stick to the default color scale though.
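As a rough idea of what a black and white scale involves, this sketch maps a numeric column onto shades of gray with base R's colorRampPalette(). It is not the code inside map.market(). The number of shades and the comment counts are assumptions made up for illustration.

```r
comments <- c(35, 12, 8, 3, 20)  # made-up comment counts

# Build a palette of 10 grays from white (low) to black (high).
n_shades <- 10
grays <- colorRampPalette(c("white", "black"))(n_shades)

# Assign each value to one of 10 equal-width bins across its range.
bins <- cut(comments, breaks = n_shades, labels = FALSE, include.lowest = TRUE)
shade <- grays[bins]

# One row per value, with its bin number and hex color.
data.frame(comments, bin = bins, shade)
```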

I was alright with the green for this, so I saved it as a PDF and then loaded it into Illustrator as usual. I numbed the green some, cleaned up the labels with a new font and layout, and updated the legend.

Touched up version of treemap with black-green color scale.
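Saving to PDF, as mentioned above, can also be done from the console with the pdf() device. This is a minimal sketch with a stand-in plot so it runs without extra packages. The file name and dimensions are assumptions; swap in your own.

```r
# Write the chart to a PDF in a temporary folder (change the path as needed).
pdf_path <- file.path(tempdir(), "treemap.pdf")
pdf(pdf_path, width = 8, height = 6)  # size in inches

# Stand-in plot so the sketch runs on its own; in practice,
# put the map.market(...) call from Step 3 here instead.
plot(1:10, main = "Placeholder for the treemap")

dev.off()  # close the device so the file is fully written
```

Because the PDF is vector-based, the rectangles and labels stay editable when you open the file in Illustrator.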

And there you go – a treemap with just a few lines of code in our all-trusty R. Rinse and repeat with your own data.

For more examples, guidance, and all-around data goodness like this, order Visualize This, the FlowingData book on visualization, design, and statistics.

About the Author

Nathan Yau is a statistician who works primarily with visualization. He earned his PhD in statistics from UCLA, is the author of two best-selling books — Data Points and Visualize This — and runs FlowingData. Introvert. Likes food. Likes beer.
