How to Make a US County Thematic Map Using Free Tools

By Nathan Yau

Unemployment choropleth

There are about a million ways to make a choropleth map. The problem is that a lot of solutions require expensive software or have a steep learning curve. It doesn't have to be that way.

What if you just want a simple map without all the GIS stuff? In this post, I'll show you how to make a county-specific choropleth map using only free tools.

Here's what we're after. It's an unemployment map from 2009.

Step 0. System requirements

This tutorial was written with Python 2.5 and Beautiful Soup 3. If you're using a more recent version of either, you might have to modify the code. See comments below for tips.

Just as a heads up, you'll need Python installed on your computer. Python comes pre-installed on the Mac. I'm not sure about Windows. If you're on Linux, well, I'm sure you're a big enough nerd to already be fluent in Python.

We're going to make good use of the Python library Beautiful Soup, so you'll need that too. It's a super easy, super useful HTML/XML parser that you should come to know and love.

Step 1. Prepare county-specific data

The first step of every visualization is to get the data. You can't do anything without it. In this example we're going to use county-level unemployment data from the Bureau of Labor Statistics. However, you have to go through FTP to get the most recent numbers, so to save some time, download the comma-separated (CSV) file here.

It was originally an Excel file. All I did was remove the headers and save as a CSV.

Step 2. Get the blank map

Luckily, we don't have to start from scratch. We can get a blank USA counties map from Wikimedia Commons. The page links to the map in four sizes in PNG format and then one as SVG (USA_Counties_with_FIPS_and_names.svg). We want the SVG one. Download the SVG file onto your computer and save it as counties.svg.

Blank US counties map in SVG format

The important thing here, if you're not familiar with SVG (which stands for scalable vector graphics), is that it's actually an XML file. It's text with tags, and you can edit it in a text editor like you would an HTML file. The browser or image viewer reads the XML. The XML tells the browser what to show.
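To make the "SVG is just XML" point concrete, here's a minimal sketch that parses a tiny hand-written SVG with xml.etree.ElementTree from the Python 3 standard library. That's an assumption for illustration only, since the tutorial itself uses Python 2.5 and Beautiful Soup. The two-path document and its ids are made up to mimic the shape of the real map.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up SVG document; the real counties map has thousands of paths.
TINY_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">
  <path id="01001" style="fill:#d0d0d0" d="M 0,0 L 10,0 L 10,10 z"/>
  <path id="01003" style="fill:#d0d0d0" d="M 10,0 L 20,0 L 20,10 z"/>
</svg>"""

# ElementTree prefixes tag names with the document's XML namespace.
SVG_NS = "{http://www.w3.org/2000/svg}"

root = ET.fromstring(TINY_SVG)

# The root element carries the image dimensions as plain attributes.
dimensions = (root.get("width"), root.get("height"))

# Every <path> is an ordinary XML element whose attributes we can read.
path_ids = [p.get("id") for p in root.iter(SVG_NS + "path")]

print(dimensions)
print(path_ids)
```

Nothing here is image editing. It's reading tags and attributes out of a text file.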

Anyways, we've downloaded our SVG map. Let's move on to the next step.

Step 3. Open the SVG file in a text editor

I want to make sure we're clear on what we're editing. Like I said in Step 2, our SVG map is simply an XML file. We're not doing any Photoshop or image-editing. We're editing an XML file. Open up the SVG file in a text editor so that we can see what we're dealing with.

You should see something like this:

SVG is just XML that you can change in a text editor.

We don't care so much about the beginning of the SVG file, other than the width and height variables, but we'll get back to that later.

Scroll down some more, and we'll get into the meat of the map:

The path tags contain the geographies of each county.

Each path is a county. The long run of numbers gives the coordinates for the county's boundary lines. We're not going to fuss with those numbers.

We only care about the beginning and very end of the path tag. We're going to change the style attribute, namely fill color. We want the darkness of fill to correspond to the unemployment rate in each given county.

We could change each one manually, but there are over 3,000 counties. That would take too long. Instead we'll use Beautiful Soup, an XML parsing Python library, to change colors accordingly.

Each path also has an id, which is actually something called a FIPS code. FIPS stands for Federal Information Processing Standard. Every county has a unique FIPS code, and it's how we are going to associate each path with our unemployment data.
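A county FIPS code is a two-digit state code followed by a three-digit county code, and the leading zeros matter because the path ids in the SVG are strings. Here's a small Python 3 sketch of a helper that builds the full code. The name full_fips and the zero-padding are assumptions for illustration, handy if your own data lost its leading zeros somewhere along the way.

```python
def full_fips(state_code, county_code):
    """Join a state code and a county code into a five-digit FIPS string."""
    # zfill pads with leading zeros, so 1 becomes "01" and 37 becomes "037".
    return str(state_code).zfill(2) + str(county_code).zfill(3)

print(full_fips("01", "001"))  # codes that are already padded
print(full_fips(1, 1))         # integers that lost their zeros
print(full_fips("06", "37"))   # partly padded input
```

If the codes in your file are already padded strings, plain concatenation is enough.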

Step 4. Create Python script

Open a blank file in the same directory as the SVG map and unemployment data. Save it as color_map.py.

Step 5. Import necessary modules

Our script is going to do a few things. The first is to read in our CSV file of unemployment data, so we'll import the csv module in Python. We're also going to use Beautiful Soup later, so let's import that too.

import csv
from BeautifulSoup import BeautifulSoup

Step 6. Read in unemployment data with Python

Now let's read in the data.

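# Read in unemployment rates
unemployment = {}
reader = csv.reader(open('unemployment09.csv'), delimiter=",")
for row in reader:
    try:
        # State code (column 2) + county code (column 3) = five-digit FIPS
        full_fips = row[1] + row[2]
        rate = float( row[8].strip() )
        unemployment[full_fips] = rate
    except (IndexError, ValueError):
        # Skip rows that are too short or have a non-numeric rate
        pass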

We read in the data with csv.reader() and then iterate through each row in the CSV file. The FIPS code is split up in the CSV by state code (second column) and then county code (third column). We put the two together for the full FIPS county code, making a five digit number.

Rate is the ninth column. We convert it to a float since it's initially a string when we read it from the CSV.

The rate is then stored in the unemployment dictionary with the full_fips as key.
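If you're on Python 3 rather than 2.5, the same reading logic could look roughly like the sketch below. The function name read_rates is hypothetical, and the sample rows are made up purely to exercise the code. They are not real BLS figures. The column positions (state code second, county code third, rate ninth) follow the description above.

```python
import csv
import io

def read_rates(lines):
    """Return {five-digit FIPS: rate} from an iterable of CSV lines."""
    rates = {}
    for row in csv.reader(lines):
        try:
            fips = row[1] + row[2]               # state code + county code
            rates[fips] = float(row[8].strip())  # ninth column is the rate
        except (IndexError, ValueError):
            # Skip short rows and rows whose rate is not a number.
            continue
    return rates

# Made-up rows for illustration only; these are not real BLS figures.
sample = io.StringIO(
    "a,01,001,Example County,2009,,,,9.5\n"
    "b,01,003,Another County,2009,,,,N.A.\n"
    "too,short\n"
    "c,02,013,Third County,2009,,,, 7.25 \n"
)

rates = read_rates(sample)
print(rates)
```

With a real file you would pass an open file object instead of the StringIO, for example inside `with open('unemployment09.csv', newline='') as f:`.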

Cool. The data is in. Now let's load the SVG map, which, remember, is an XML file.

Step 7. Load county map

Loading the map is straightforward. It's just one line of code.

# Load the SVG map
svg = open('counties.svg', 'r').read()

The entire string is stored in svg.

Step 8. Parse it with Beautiful Soup

Loading svg into Beautiful Soup is also straightforward.

# Load into Beautiful Soup
soup = BeautifulSoup(svg, selfClosingTags=['defs','sodipodi:namedview'])

Step 9. Find all the counties in the SVG

Beautiful Soup has a nifty findAll() function that we can use to find all the counties in our SVG file.

# Find counties
paths = soup.findAll('path')

All paths are stored in the paths array.

Step 10. Decide what colors to use for map

There are plenty of color schemes to choose from, but if you don't want to think about it, give ColorBrewer a whirl. It's a tool to help you decide your colors. For this particular map, I chose the purple-red scheme (PuRd) with six data classes.

ColorBrewer interface for easy, straightforward way to pick colors

In the bottom left-hand corner are our color codes. Select the hexadecimal option (HEX), and then create an array of those hexadecimal colors.

# Map colors
colors = ["#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", "#980043"]

Step 11. Prepare style for paths

We're getting close to the climax. Like I said earlier, we're going to change the style attribute for each path in the SVG. We're just interested in fill color, but to make things easier we're going to replace the entire style instead of parsing to replace only the color.

# County style
path_style = ('font-size:12px;fill-rule:nonzero;stroke:#FFFFFF;stroke-opacity:1;'
    'stroke-width:0.1;stroke-miterlimit:4;stroke-dasharray:none;stroke-linecap:butt;'
    'marker-start:none;stroke-linejoin:bevel;fill:')

Everything is the same as the previous style except we moved fill to the end and left the value blank. We're going to fill that in just a second. We also changed stroke to #FFFFFF to make county borders white. We didn't have to leave that value blank, because we want all borders to be white while fill depends on unemployment rate.
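For comparison, here's roughly what the "parse and replace only the color" route could look like. It's a Python 3 sketch with a hypothetical helper, set_fill, and a made-up style string. It assumes simple name:value declarations separated by semicolons, which is what the styles in this map look like, and it isn't a full CSS parser.

```python
def set_fill(style, color):
    """Return a copy of an SVG style string with its fill set to color."""
    declarations = []
    for part in style.split(";"):
        if not part.strip():
            continue  # ignore empty pieces, e.g. from a trailing ';'
        name, _, value = part.partition(":")
        if name.strip() != "fill":  # drop the old fill, keep everything else
            declarations.append(name.strip() + ":" + value.strip())
    declarations.append("fill:" + color)
    return ";".join(declarations)

# A made-up style for illustration.
old_style = "fill:#d0d0d0;stroke:#000000;stroke-width:0.1;fill-rule:nonzero"
new_style = set_fill(old_style, "#980043")
print(new_style)
```

It works, but it's more code than swapping in one fixed style, which is why the tutorial takes the shortcut.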

Step 12. Change color of counties

We're ready to change colors now! Loop through all the paths, find the unemployment rate from the unemployment dictionary, and then select color class accordingly. Here's the code:

# Color the counties based on unemployment rate
for p in paths:

    if p['id'] not in ["State_Lines", "separator"]:
        try:
            rate = unemployment[p['id']]
        except KeyError:
            # No rate for this county, so leave its style alone
            continue

        if rate > 10:
            color_class = 5
        elif rate > 8:
            color_class = 4
        elif rate > 6:
            color_class = 3
        elif rate > 4:
            color_class = 2
        elif rate > 2:
            color_class = 1
        else:
            color_class = 0

        color = colors[color_class]
        p['style'] = path_style + color

Notice the if statement. I don't want to change the style of state lines or the line that separates Hawaii and Alaska from the rest of the states.

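I also hard-coded the conditions to decide the color class because I knew beforehand what the distribution is like. If, however, you didn't know the distribution, you could scale linearly with something like this: int(round(float(len(colors)-1) * float(rate - min_value) / float(max_value - min_value))). Here min_value and max_value are the lowest and highest rates in your data, and int(round(...)) turns the result into a whole number you can use as a list index.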
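Here's a sketch of both approaches side by side, written for Python 3. The function names are hypothetical, the rates are made up, and the rounding and clamping in the linear version are assumptions added so the result is always a valid index into colors. The threshold version uses bisect as a compact stand-in for the if/elif ladder and gives the same classes.

```python
from bisect import bisect_left

colors = ["#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", "#980043"]

# Same cut points as the if/elif ladder: > 2, > 4, > 6, > 8, > 10.
BREAKS = [2, 4, 6, 8, 10]

def threshold_color_class(rate):
    """Count how many break points the rate is strictly greater than."""
    return bisect_left(BREAKS, rate)

def linear_color_class(rate, min_value, max_value, n_colors):
    """Scale rate linearly onto an integer index in 0 .. n_colors - 1."""
    if max_value == min_value:
        return 0  # avoid dividing by zero when every rate is identical
    scaled = (n_colors - 1) * (rate - min_value) / (max_value - min_value)
    # Round to the nearest class, then clamp in case rate is out of range.
    return max(0, min(n_colors - 1, int(round(scaled))))

# Made-up rates for illustration, including one large outlier.
rates = {"a": 3.0, "b": 9.0, "c": 28.0}
min_value, max_value = min(rates.values()), max(rates.values())

by_threshold = {k: threshold_color_class(v) for k, v in rates.items()}
by_scaling = {k: linear_color_class(v, min_value, max_value, len(colors))
              for k, v in rates.items()}

print(by_threshold)
print(by_scaling)
```

With an outlier in the data, linear scaling squeezes most counties into the lightest classes, which is one reason to pick break points by hand once you know the distribution.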

Step 13. Output modified map

Almost done. We just need to output the newly colored SVG map.

# Output map
print soup.prettify()

Save your Python script. For the sake of completeness, here's what your Python script should now look like:

### color_map.py

import csv
from BeautifulSoup import BeautifulSoup

# Read in unemployment rates
unemployment = {}
min_value = 100; max_value = 0
reader = csv.reader(open('unemployment09.csv'), delimiter=",")
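for row in reader:
    try:
        # State code (column 2) + county code (column 3) = five-digit FIPS
        full_fips = row[1] + row[2]
        rate = float( row[8].strip() )
        unemployment[full_fips] = rate
    except (IndexError, ValueError):
        # Skip rows that are too short or have a non-numeric rate
        pass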


# Load the SVG map
svg = open('counties.svg', 'r').read()

# Load into Beautiful Soup
soup = BeautifulSoup(svg, selfClosingTags=['defs','sodipodi:namedview'])

# Find counties
paths = soup.findAll('path')

# Map colors
colors = ["#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", "#980043"]

# County style
path_style = ('font-size:12px;fill-rule:nonzero;stroke:#FFFFFF;stroke-opacity:1;'
    'stroke-width:0.1;stroke-miterlimit:4;stroke-dasharray:none;stroke-linecap:butt;'
    'marker-start:none;stroke-linejoin:bevel;fill:')

# Color the counties based on unemployment rate
for p in paths:
    
    if p['id'] not in ["State_Lines", "separator"]:
        try:
            rate = unemployment[p['id']]
        except KeyError:
            # No rate for this county, so leave its style alone
            continue

        if rate > 10:
            color_class = 5
        elif rate > 8:
            color_class = 4
        elif rate > 6:
            color_class = 3
        elif rate > 4:
            color_class = 2
        elif rate > 2:
            color_class = 1
        else:
            color_class = 0


        color = colors[color_class]
        p['style'] = path_style + color

print soup.prettify()

Step 14. Run script and save new map

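Now all we have to do is run our script and save the output. The script prints the modified SVG, so in a terminal you redirect that output into a new file, for example: python color_map.py > unemployment2009.svg (the output filename is up to you).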

Running script in OS X Terminal

We're done! Open your SVG in Firefox or Safari, and you should see a nicely colored map.

Oh wait, there's one teeny little thing. The state borders are still dark grey. We can make those white by editing the SVG file manually.

We open our new SVG in a text editor, and change the stroke to #FFFFFF from #221e1f around line 15780. Do something similar on line 15785 for the separator. Okay. Now we're done.
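If you'd rather not hunt for line numbers, the same change can be made with a string replacement on the style. This is a Python 3 sketch with a hypothetical helper, whiten_stroke, and a made-up style string. It assumes the style contains the stroke color written exactly as stroke:#221e1f, so a different case or extra spacing would slip past it.

```python
def whiten_stroke(style, old="#221e1f", new="#FFFFFF"):
    """Swap one stroke color for another inside an SVG style string."""
    return style.replace("stroke:" + old, "stroke:" + new)

# A made-up style in the spirit of the State_Lines path.
state_lines_style = "fill:none;stroke:#221e1f;stroke-width:0.4"
whitened = whiten_stroke(state_lines_style)
print(whitened)
```

In the script, something like this could be applied to the paths whose id is State_Lines or separator, the same two ids the coloring loop skips.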

Where to Go From Here

While this tutorial was focused on unemployment data, I tried to keep it general enough so that you could apply it to other datasets. All you need are data with FIPS codes, and it should be fairly straightforward to hack the above script.

You can also load the SVG into Adobe Illustrator or your favorite open source vector art software and edit the map from there, which is what I did for the final graphic.

So go ahead. Give it a try. Have fun.

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.