• How to Make a Heatmap – a Quick and Easy Solution

    Posted to Tutorials

    The Heatmap

    In case you don't know what a heatmap is, it's basically a table that has colors in place of numbers. Colors correspond to the level of the measurement. Each column can be a different metric like above, or it can be all the same like this one. It's useful for finding highs and lows and, sometimes, patterns.

    On to the tutorial.

    Step 0. Download R

    We're going to use R for this. It's a statistical computing language and environment, and it's free. Get it for Windows, Mac, or Linux. It's a simple one-click install for Windows and Mac. I've never tried Linux.

    Did you download and install R? Okay, let's move on.

    Step 1. Load the data

    Like all visualization, you should start with the data. No data? No visualization for you.

    For this tutorial, we'll use NBA basketball statistics from last season that I downloaded from databaseBasketball. I've made it available here as a CSV file. You don't have to download it though. R can do it for you.

    I'm assuming you started R already. You should see a blank window.

    Initial R window when you open it. Exciting, I know.

    Now we'll load the data using read.csv().

    nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv", sep=",")
    

    We've read a CSV file from a URL and specified the field separator as a comma. The data is stored in nba.

    Type nba in the window, and you can see the data.

    What the data looks like when you load it into R
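
    If printing the whole table is too much, a few base R functions give a quicker look. Here's a minimal sketch on a made-up stand-in for the file (the player names and numbers are hypothetical, and I'm only assuming columns called Name, PTS, AST, and BLK), so it runs without a network connection:

    ```r
    # Toy stand-in for the downloaded CSV; names and values are made up.
    csv_lines <- c(
      "Name,PTS,AST,BLK",
      "Player A,30.2,7.5,1.3",
      "Player B,28.4,7.2,1.1",
      "Player C,26.8,4.9,0.5"
    )

    # read.csv() parses text= the same way it parses a file or a URL.
    toy <- read.csv(text = paste(csv_lines, collapse = "\n"))

    head(toy, 2)  # first two rows
    str(toy)      # column names and types
    dim(toy)      # number of rows, number of columns
    ```

    The same three calls should work on nba once it's loaded.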

    Step 2. Sort data

    The data is sorted by points per game, greatest to least. Let's make it the other way around so that it's least to greatest.

    nba <- nba[order(nba$PTS),]
    

    We could just as easily have chosen to order by assists, blocks, etc.
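
    To see how order() behaves with a different column or direction, here's a small sketch on a made-up data frame. The players and values are hypothetical, and the AST column name is an assumption for illustration:

    ```r
    # Made-up data; only the shape resembles the tutorial's table.
    toy <- data.frame(
      Name = c("Player A", "Player B", "Player C"),
      PTS  = c(30.2, 28.4, 26.8),
      AST  = c(4.9, 7.5, 7.2)
    )

    # Least to greatest by points, as in the step above.
    by_pts <- toy[order(toy$PTS), ]

    # Greatest to least by assists.
    by_ast_desc <- toy[order(toy$AST, decreasing = TRUE), ]

    by_pts
    by_ast_desc
    ```

    order() returns row positions, so the comma inside the brackets matters: it says "these rows, all columns."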

    Step 3. Prepare data

    As is, the column names match the CSV file's header. That's what we want.

    But we also want to name the rows by player name instead of row number, so type this in the window:

    row.names(nba) <- nba$Name
    

    Now the rows are named by player, and we don't need the first column anymore so we'll get rid of it:

    nba <- nba[,2:20]
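
    Selecting by position works when you know the column layout. If you'd rather not count columns, one alternative is to drop the column by name. This is a sketch on made-up data, not what the line above does:

    ```r
    # Made-up data with a Name column, like the tutorial's table.
    toy <- data.frame(
      Name = c("Player A", "Player B", "Player C"),
      PTS  = c(30.2, 28.4, 26.8),
      AST  = c(4.9, 7.5, 7.2)
    )

    # Label rows by player, then keep every column except Name.
    row.names(toy) <- toy$Name
    toy_stats <- toy[, names(toy) != "Name"]

    toy_stats
    ```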
    

    Step 4. Prepare data, again

    Are you noticing something here? It's important to note that a lot of visualization involves gathering and preparing data. Rarely do you get data exactly how you need it, so you should expect to do some data munging before the visuals. Anyways, moving on.

    The data was loaded into a data frame, but it has to be a data matrix to make your heatmap. The difference between a frame and a matrix is not important for this tutorial. You just need to know how to change it.

    nba_matrix <- data.matrix(nba)
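
    If you want to check that the conversion did what you expect, here's a quick sketch on a made-up data frame (hypothetical players and values):

    ```r
    # Made-up data frame with player names as row names.
    toy <- data.frame(
      PTS = c(30.2, 28.4, 26.8),
      AST = c(4.9, 7.5, 7.2),
      row.names = c("Player A", "Player B", "Player C")
    )

    toy_matrix <- data.matrix(toy)

    is.data.frame(toy)        # the original is a data frame
    is.matrix(toy_matrix)     # the result is a matrix
    is.numeric(toy_matrix)    # every cell is a number
    rownames(toy_matrix)      # row names carry over
    ```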
    

    Step 5. Make a heatmap

    It's time for the finale. In just one line of code, build the heatmap:

    nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = cm.colors(256), scale="column", margins=c(5,10))
    

    You should get a heatmap that looks something like this:

    Default cyan to purple heatmap
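
    A note on scale="column": it tells heatmap() to center and scale each column on its own, so a column with big numbers doesn't drown out one with small numbers. The sketch below uses base R's scale() on a made-up matrix to show the idea. It mirrors the column scaling rather than reproducing heatmap()'s internals:

    ```r
    # Made-up matrix: two metrics on very different ranges.
    m <- matrix(
      c(30.2, 28.4, 26.8,
         4.9,  7.5,  7.2),
      ncol = 2,
      dimnames = list(c("Player A", "Player B", "Player C"), c("PTS", "AST"))
    )

    # Subtract each column's mean and divide by its standard deviation.
    m_scaled <- scale(m)

    m_scaled
    colMeans(m_scaled)       # each column is now centered near zero
    apply(m_scaled, 2, sd)   # and has unit standard deviation
    ```

    After scaling, colors compare each player to the others within the same column, not across columns.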

    Step 6. Color selection

    Maybe you want a different color scheme. Just change the argument to col, which is cm.colors(256) in the line of code we just executed. Type ?cm.colors for help on what colors R offers. For example, you could use more heat-looking colors:

    nba_heatmap <- heatmap(nba_matrix, Rowv=NA, Colv=NA, col = heat.colors(256), scale="column", margins=c(5,10))
    

    Changing to heat colors with the col argument

    For the heatmap at the beginning of this post, I used the RColorBrewer library. Really, you can choose any color scheme you want. The col argument accepts any vector of hexadecimal-coded colors.
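
    As one way to build such a vector without installing anything, here's a sketch using base R's colorRampPalette(). The two endpoint colors are my own picks for illustration, not the scheme used for the graphic in this post:

    ```r
    # Interpolate 256 colors from white to a dark blue.
    # The endpoint colors are arbitrary choices for this example.
    blues <- colorRampPalette(c("#FFFFFF", "#08306B"))(256)

    length(blues)    # number of colors generated
    head(blues, 3)   # hex-coded colors at the light end

    # The vector can then go straight into the col argument, e.g.
    # heatmap(nba_matrix, Rowv=NA, Colv=NA, col = blues, scale="column", margins=c(5,10))
    ```

    RColorBrewer's brewer.pal() is another route, but it needs that package installed, so this sketch sticks to base R.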

    Step 7. Clean it up - optional

    If you're using the heatmap to simply see what your data looks like, you can probably stop. But if it's for a report or presentation, you'll probably want to clean it up. You can fuss around with the options in R or you can save the graphic as a PDF and then import it into your favorite illustration software.
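
    Saving to PDF can be done from the R window too. Here's a minimal sketch with a made-up matrix. The file name and page size are assumptions, and it writes to a temporary folder so it doesn't clutter your working directory:

    ```r
    # Made-up matrix standing in for nba_matrix.
    m <- matrix(
      c(30.2, 28.4, 26.8,
         4.9,  7.5,  7.2),
      ncol = 2,
      dimnames = list(c("Player A", "Player B", "Player C"), c("PTS", "AST"))
    )

    # Hypothetical output path; swap in any file name you like.
    out_file <- file.path(tempdir(), "toy_heatmap.pdf")

    pdf(out_file, width = 6, height = 8)   # open a PDF graphics device
    heatmap(m, Rowv = NA, Colv = NA, col = heat.colors(256),
            scale = "column", margins = c(5, 10))
    dev.off()                              # close the device to finish the file

    file.exists(out_file)
    ```

    Because PDF is a vector format, the cells and labels should stay editable when you open the file in illustration software.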

    I personally use Adobe Illustrator, but you might prefer Inkscape, the open source (free) solution. Illustrator is kind of expensive, but you can probably find an old version on the cheap. I still use CS2. Adobe's up to CS4 already.

    For the final basketball graphic, I used a blue color scheme from RColorBrewer and then lightened the blue shades, added a white border, changed the font, and organized the labels in Illustrator. Voila.

    Updated heatmap in Illustrator with clearer labels and a blue-white color scale

    Rinse and repeat to use with your own data. Have fun heatmapping.

    For more on custom heat maps to visualize your data, check out the members-only tutorial.

  • Data.gov.uk Gearing Up For Launch, er, Does Launch

    Posted to Data Sources, Mapping

    Update: I had scheduled this post for next week, but apparently, Data.gov.uk launched today. The site isn't loading for me right now though. I guess they weren't prepared for traffic.

    Data.gov, a catalog of US data, launched last year. Now it's the UK's turn. Well, not yet. But soon. Data.gov.uk is still under lock and key, but it has granted access to some developers. Ito Labs, the group behind mapping a year of OpenStreetMap edits, posted screenshots of their maps that show vehicle counts (above).

    Here are some comparison maps between 2001 and 2008, by vehicle type.
     Continue Reading 

  • The Very First Thematic Maps

    Posted to Mapping

    I'm admittedly not very good with historical precedent, but I think we can all agree it's important to know about the work others have done before us. It makes your own work better and lets you appreciate what others do more (or less).
     Continue Reading 

  • Thanks, FlowingData Sponsors

    Posted to Sponsors

    Thank you, sponsors. I wouldn't be able to do what I do on this blog without you. It seems like FlowingData is growing faster every month, and you guys make that possible.

    Check out what these fine groups have to offer. They help you understand your data:

    Tableau Software – Data exploration and visual analytics in an easy-to-use analysis tool.

    InstantAtlas – Create and present compelling data reports on geographic maps.

    NetCharts – Agile Performance Dashboarding™ for business users.

    Xcelsius Engage – Create insightful and engaging dashboards from any data source with point-and-click ease.

    Business Intelligence – Visual data analysis made easy. Try 30 days for free.

    FusionCharts – Convert all your boring data to stunning charts. Download your free trial now.

    Xcelsius Present – Transform spreadsheets into professional, interactive presentations.

    Email me at nathan [at] flowingdata [dot] com if you'd like to sponsor FlowingData, and I'll send you the details.

  • Crayola Crayon Colors Multiply Like Rabbits

    Posted to Infographics

    In 1903, Crayola had eight colors in its standard package. Today, there are 120, along with special packs like Gem Tones and Silver Swirls. What happened? The chart above, from Weather Sealed, shows the growing color selection (and a few color retirements) in the standard package from 1903 to now.

    In 2101, Crayola will hit a color peak and revert to a simpler time. The standard pack will have just two colors: black and Tickle Me Pink (#FC89AC).

    [via Waxy Links]

  • Data Underload #5 – The Portfolio

    Posted to Data Underload

  • Data Visualization Christmas Ornaments

    Posted to Data Art

    It's funny how data is finding its way into everyday objects. There was jewelry a few months ago and coins last month. Now we've got this experiment with Christmas ornaments from Really Interesting Group (RIG). The snowman's head is sized by the number of followers on Twitter; the (rain) bars represent miles traveled per month on Dopplr; the red shows listening habits on last.fm; and finally, the blue one shows apertures you've used over the year for photos uploaded to Flickr.
     Continue Reading 

  • Buy a Print. Support Disaster Relief in Haiti. Please.

    Posted to Mapping, Site News

    Unless you live under a rock inside a cave in the remotest area in the world, you know a huge quake struck Haiti on Tuesday, and much lies in ruins. The New York Times just posted some before and after satellite images, and it's a horrible thing to see. Buildings gone. People gone.

    It pains me to think about what it would be like if that were to happen to me or my family.

    To this end, I'm donating all proceeds from World Progress Report orders, along with this month's FlowingData revenues, to UNICEF's relief efforts. The Report, after all, is an effort to relate to the rest of the world. It only seems fitting. It's not much in the grand scheme of things, I guess, but at least it's something. As they say, every little bit counts.

    Again, I'm taking orders for one week - through January 21. Do some good and get something good too. I'm including How America Learns with all orders now. Buy a print now.

    Or if the World Progress Report just isn't your thing, you can donate directly to UNICEF.

    I mean, seriously, there are 27,000 of you + me. We can make a big difference together.

  • Graphical World Progress Report – Now Available

    Posted to Projects

    Want the report? Details at the end on how to get a print. (Update: All proceeds go to UNICEF towards the relief effort in Haiti.)

    UNdata provides a catalog of 27 United Nations statistical databases and 60 million records about the past, present, and future state of the world. Topics include demographics, life expectancy, labor levels, poverty, and a lot more. What does all that data mean though? World Progress Report, the latest from FlowingPrints, offers a look into the expansive UN collection.

    In whole, the report tells a story of how we live and die, and the stuff in between.
     Continue Reading 

  • Timescapes to Compare Chopin Recordings

    How do you compare music visually? You can break it down into data by quantifying the notes, volume, etc., and then visualize it with timescapes (above). The horizontal axis represents musical time, from the beginning to end of a piece. Large blocks show similarities to other pieces and smaller noisy chunks show more "fleeting" similarities.
     Continue Reading 

  • Data Underload #4 – Little Things

    Posted to Data Underload

  • The Geography of Netflix Rentals

    Posted to Mapping

    Some movies are popular everywhere. Others are popular only in certain regions. The New York Times, in a nice team effort, maps rental popularity by zip code for large regions in the US.
     Continue Reading 

  • Need to Escape Jupiter’s Gravitational Pull? Good Luck

    Posted to Infographics

    Randall of xkcd has been having fun with data visualization lately. In his latest data-ish comic, Randall explores gravity wells. The height of each well is sized relative to the amount of energy (on Earth) it would take to escape that planet's gravity. The widths of the wells are scaled by planet size.

    So you'd need one big arse rocket to escape Jupiter.

    I know it's a comic, hand-drawn, and all stick-figurey and stuff, but Randall actually explains the concepts really well. There's good annotation, clear examples, and he's made an obscure topic easy to understand.

    It's also entertaining in the Bill Nye the Science Guy (i.e. best Saturday morning show ever) sort of way.

    [Thanks, Ricki and Thomas]

  • Graphical World Progress Report – A Sneak Peek

    Posted to Projects

    FYI: A new edition on the current state of the world is coming soon from FlowingPrints. Join the mailing list to be first to know when it's available. I'm only going to take orders for one week this time around, so please please make sure you sign up. More info coming next week.
     Continue Reading 

  • 11 Ways to Visualize Changes Over Time – A Guide

    Posted to Design

    Deal with data? No doubt you've come across the time-based variety. The visualization you use to explore and display that data changes depending on what you're after and the type of data you have. Maybe you're looking for increases and decreases, or maybe seasonal patterns.

    This is a guide to help you figure out what type of visualization to use to see that stuff.
     Continue Reading 

  • Even Older Infographics from the 19th Century

    Posted to Infographics

    Old graphics are awesome. We saw some from the 1930s already. These are even older.

    Other than the maps, I don't exactly know what I'm looking at (knowing French would help too), but who cares? Mmm, hand-drawn goodness.
     Continue Reading 

  • A Visual History of Loudness in Popular Music

    Posted to Infographics

    All Things Considered discusses why music sounds worse than it did a few decades ago. Through a practice using compressors, the quiet parts of a song are made louder and the louder parts quieter so that the song as a whole sounds louder to your ear. The purpose: to make the song stand out when you hear it on the radio.

    As a result, tracks have gotten louder over the years.
     Continue Reading 

  • Data Underload #3 – The Resolution Cycle

    Posted to Data Underload

  • The Universe as We Know It

    Posted to Mapping

    The Known Universe from the American Museum of Natural History shows a view of the universe, starting from the Himalayas and quickly moving out to the edge where all is black and scary - made possible by the records in the Digital Universe Atlas.
     Continue Reading 

  • Charting the Decade

    Posted to Infographics

    Did we all see this? Phillip Niemeyer of Double Triple pictures the past ten years in this Op-Chart for The New York Times. Each row is a theme, and each column represents a year. For example, the champion rep for 2007 is Tiger Woods, and the fad of 2002 is collagen. Oh how times change.

    Have a happy new year everyone. Be safe.

    [via WeLoveDataVis]