  • Code For Walmart Growth Visualization Now Available

    October 21, 2008  |  Projects

    It took me three months to do it, but the code to visualize the growth of Walmart is now available under a BSD license (that means free and open like a leaf in the wind):

    Download Walmarts.tar.gz

    I've included the ActionScript and the Walmart openings data, which should be all you need to create your own Walmart growth visualization or, if you're more industrious, to visualize some other type of growth in the world. Let me know if you're able to improve upon my code, as there are definitely a few areas that wouldn't mind some improvement.

    So go wild, have fun with it, and let me know if you apply the code to another dataset. (I also wouldn't mind if someone wrote some documentation.)

    UPDATE: I am no longer supporting this code.

  • 40 Essential Tools and Resources to Visualize Data

    October 20, 2008  |  Software

    One of the most frequent questions I get is, "What software do you use to visualize data?" A lot of people are excited to play with their data, but don't know how to go about doing it or even start. Here are the tools I use or have used and resources that I own or found helpful for data visualization – starting with organizing the data, to graphs and charts, and lastly, animation and interaction.

    Organizing the Data


    Data are hardly ever in the format that you need them to be in. Maybe you got a comma-delimited file and you need it to be in XML; or you got an Excel spreadsheet that needs to go into a MySQL database; or the data are stuck on hundreds of HTML pages and you need to get it all together in one place. Data organization isn't incredibly fun, but it's worth getting to know these tools/languages. The last thing you want is to be restricted by data format.

    PHP

    PHP was the first scripting language I learned that was well-suited for the Web, so I'm pretty comfortable with it. I oftentimes use PHP to get CSV files into some XML format. The function fgetcsv() does just fine. It's also a good hook into a MySQL database or calling API methods.
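    For anyone who wants to see the CSV-to-XML step spelled out, here is a minimal sketch. It is in Python rather than PHP, using the standard csv module where PHP would use fgetcsv(). The function name csv_to_xml, the tag names "rows" and "row", and the sample data are all made up for illustration; the sketch also assumes the CSV header names are valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text with a header row into an XML string.

    Hypothetical helper: each CSV record becomes one <row> element and
    each column becomes a child element named after its header.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Header names are used as tag names, so they must be valid XML names.
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, only to show the shape of the output.
sample = "name,value\nalpha,1\nbeta,2\n"
xml_text = csv_to_xml(sample)
print(xml_text)
```

    Reading from a real file instead of a string only means passing an open file handle to csv.DictReader.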

    RESOURCES:

    Python

    Most computer science types - at least the ones I've worked with - scoff at PHP and opt for Python mostly because Python code is often better structured (as a requirement) and has cooler server-side functions. My favorite Python toy is Beautiful Soup, which is an HTML/XML parser. What does that mean? Beautiful Soup is excellent for screen scraping.

    RESOURCES:

    MySQL

    When I have a lot of data - on the order of tens to hundreds of thousands of records - I use PHP or Python to stick it in a MySQL database. MySQL lets me subset the data in pretty much any way I please.
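    Here is a rough sketch of what subsetting with SQL looks like. To keep it runnable without a database server, it uses Python's built-in sqlite3 module as a stand-in for MySQL; the SELECT statement itself would read the same in MySQL. The table name openings, its columns, and the rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE openings (store_id INTEGER, state TEXT, year INTEGER)")

# Made-up rows, only for illustration.
rows = [(1, "AR", 1962), (2, "AR", 1964), (3, "MO", 1968), (4, "TX", 1975)]
conn.executemany("INSERT INTO openings VALUES (?, ?, ?)", rows)


def openings_before(connection, year):
    """Count stores per state that opened before the given year."""
    cursor = connection.execute(
        "SELECT state, COUNT(*) FROM openings "
        "WHERE year < ? GROUP BY state ORDER BY state",
        (year,),
    )
    return cursor.fetchall()


subset = openings_before(conn, 1970)
print(subset)
```

    Changing the WHERE clause is all it takes to slice the same table a different way, which is the main reason to load the data into a database in the first place.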

    RESOURCES:

    R

    Ah, good old R. It's what statisticians use, and pretty much nobody else. Everyone else has it installed on their computer but hasn't gotten around to learning it. I use R for analysis. Sometimes, though, I use it to extract useful subsets from a dataset when the conditions are more complex than those I'd use with MySQL, and then export the subsets as CSV files.

    RESOURCES:

    Microsoft Excel

    We all know this one. I use Excel from time to time when my dataset is small or if I'm in a point-and-click mood.
    Continue Reading

  • Comparative View of Length of Rivers and Height of Mountains

    October 17, 2008  |  Infographics

    I had no idea these comparative views of length of rivers and heights of mountains were so popular - at least in the 1800s. There seemed to be a fascination with placing rivers and mountains next to each other when normally, we're used to seeing them intertwined in a geographic landscape. The above is actually just river lengths, but here's one that places rivers and mountains next to each other.
    Continue Reading

  • New York Times Rolls Out Campaign Finance API

    October 16, 2008  |  Data Sources

    The New York Times announced the opening of their Developer Network a couple of days ago. It's their "API clearinghouse and community." It might seem kind of weird that a newspaper company has an API, but as many FlowingData readers know, the Times prides itself on innovation.

    The Campaign Finance API is currently available:

    With the Campaign Finance API, you can retrieve contribution and expenditure data based on United States Federal Election Commission filings. Campaign finance data is public and is therefore available from a variety of sources, but the developers of the Times API have distilled the data into aggregates that answer most campaign finance questions. Instead of poring over monthly filings or searching a disclosure database, you can use the Times Campaign Finance API to quickly retrieve totals for a particular candidate, see aggregates by ZIP code or state, or get details on a particular donor.

    Anyone who has tried to play with FEC data, myself included, knows that this API is cool. You could get the data directly from the FEC, but it's a bit of a painstaking process. Now you don't have to sift through a bunch of reports or an awkward user interface.

    The Movie Review API is next in line. After that, who knows, but it's a good step forward for The Times.

    [via serial consign]

  • United States Poverty Rates From 1980 to 2007

    October 15, 2008  |  Mapping, Projects

    Thousands of bloggers are taking the time to discuss a single topic today - poverty. As we sit in our cozy homes, go out to eat, watch movies, or simply read the news on a computer, it's easy to forget that there are millions of people around the world who aren't so well off. Blog Action Day is an opportunity to remember and to perhaps help out in some way.

    Mapping Poverty Rates

    I of course took the visualization route. What better way to get the facts than through data? The US Census Bureau provides lots of poverty estimates, so I took their data and mapped it over the last 27 years. I found it alarming to see that some states had a poverty rate over 20%. I clearly live in a cozy bubble. What does your state look like?

  • Visualizing YouTube, Blogs, Twitter, Flickr, People…

    October 14, 2008  |  Network Visualization

    From the guys who brought you 6pli and other like-minded network visualization tools, Bestiario takes 6pli to the next level. 6pli lets users explore their del.icio.us bookmarks. This work, in collaboration with Harvard's Berkman Center, also lets users explore their del.icio.us bookmarks - as well as YouTube videos, Flickr photos, Twitter tweets, and content from Wikipedia, blogs, and other places. Items are clustered by content type and meta information. Yes, it's a whole lot of stuff in one place.

    The main idea is to take a few steps away from the list and scroll paradigm - sort of like DoodleBuzz, but from a more analytical standpoint. Does it make all those personal streams easier to browse and explore than something like FriendFeed? You be the judge.

    [Thanks, Jose]

  • Browse Political Bias on Memeorandum – Greasemonkey Script

    October 13, 2008  |  Statistical Visualization

    Memeorandum shows up-to-date posts from leading political bloggers, and it is well-known that political bloggers are often very partisan. It's not always obvious to new readers though which side of the line a blogger sits on. You certainly can't always tell just from a headline on Memeorandum. So Andy Baio, with the help of del.icio.us founder, Joshua Schachter, created a Greasemonkey script (and Firefox plugin) to do just that. Simply install the script and browse popular political articles by their bias.

    With the help of del.icio.us founder Joshua Schachter, we used a recommendation algorithm to score every blog on Memeorandum based on their linking activity in the last three months. Then I wrote a Greasemonkey script to pull that information out of Google Spreadsheets, and colorize Memeorandum on-the-fly. Left-leaning blogs are blue and right-leaning blogs are red, with darker colors representing strong biases.

    Just a quick glance at Memeorandum with the plugin installed shows the magic works.

    How it Was Done

    Of course this isn't just magic. It's not human-powered. It's a data-driven algorithm. It's statistics. The data are the articles that the Memeorandum-listed blogs link to, so just imagine a giant matrix with number of links. They then use singular value decomposition (SVD) to reduce that matrix to one dimension which they use to estimate where on the political spectrum any given blog on Memeorandum sits.
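    To make the idea concrete, here is a small sketch of reducing a blog-by-article link matrix to one score per blog. This is not Andy's code, and his exact preprocessing isn't described here. The sketch finds only the first left singular vector, using power iteration in plain Python instead of a full SVD routine. Centering each column first is my assumption: without it, the first component of a non-negative matrix mostly tracks overall linking volume. The matrix values and the function name are invented.

```python
import math


def first_left_singular_vector(matrix, iterations=200):
    """Return one score per row: the first left singular vector of the
    column-centered matrix, found by power iteration (hypothetical helper)."""
    n_rows, n_cols = len(matrix), len(matrix[0])

    # Center each column so the leading component captures how rows
    # differ from each other rather than how much they link overall.
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]

    # Deterministic, non-constant starting vector.
    u = [float(i + 1) for i in range(n_rows)]
    for _ in range(iterations):
        # v = M^T u, then u = M v: one power-iteration step on M M^T.
        v = [sum(centered[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        u = [sum(centered[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0:
            return [0.0] * n_rows
        u = [x / norm for x in u]
    return u


# Made-up link counts: rows are blogs, columns are articles.
links = [
    [3, 2, 1, 0, 0, 0],  # blog A
    [2, 3, 0, 0, 0, 1],  # blog B
    [0, 0, 0, 3, 2, 2],  # blog C
    [0, 1, 0, 2, 3, 1],  # blog D
]
scores = first_left_singular_vector(links)
print(scores)
```

    In this toy matrix, blogs A and B end up with scores of one sign and C and D with the other. The sign itself is arbitrary, so deciding which end is "left" or "right" still takes a human label for at least one blog.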

    All you statistics readers (and maybe some of the computer scientists) should be familiar with SVD. I learned about it and played with it quite a bit during my first year in graduate school. Anyways, it's cool to see statistics at work and how it can be useful in visualization. A lot of the time visualization projects are about getting all the data on the screen, but with a little bit of know-how (or help from someone who has it) you can produce projects that let the computer do a lot of the pattern-finding work and don't make the user work so hard.

    By the way, Andy's blog Waxy has become one of my favorite blogs as of late, so if political bias isn't your thing, I'd still encourage you to go check it out.

  • Great Data Visualization Tells a Great Story

    October 10, 2008  |  Design

    Think of all the popular data visualization pieces out there - the ones that you always hear about in lectures, read about in blogs, and the ones that popped into your head as you were reading this sentence. What do they all have in common? They probably all told a great story. Maybe the story was to convince us of something, compel us to action, enlighten us with new information, or force us to question our own preconceptions. Whatever it is, truly great data visualization reaches us at a very human level, and that is why we remember it.

    Let's face it. Data can be boring if you don't know what you're looking for or don't know that there's something to look for in the first place. It's just a mix of numbers and words that mean nothing other than their raw value. The great thing about statistics and data visualization though is that they provide us with the tools to learn that the data are much more than a bucket of numbers. There are stories in that bucket. There's meaning, truth, and beauty. Sometimes the stories will be simple and other times complex. Some will belong in a textbook; others will come in novel form. It's up to the statistician, computer scientist, designer, or analyst to make that decision.
    Continue Reading

  • Daily Design Workout – DONE by Jonas Buntenbruch

    October 9, 2008  |  Data Art

    DONE is a sketching project by Jonas Buntenbruch. He takes 30-60 minutes per day and puts his design skills to work. He started on January 1 of this year and has produced a sketch/design for every day so far.

    Some of his work is charts and graphs, but most are of the typography, cartoon, and icon variety. Nevertheless, it's a great way to hone the design skills. You learn what works, what doesn't work, and skills that need sharpening. Learn by doing has always been my philosophy - mostly because I suck at learning by listening, writing, and reading. Seriously. I took a learning test in fourth grade that told me so.

    Can someone please do a data visualization per day? Don't forget to make it awesome.

    [Thanks, Adam]

  • Commercial Air Traffic Seen Around the World

    October 8, 2008  |  Mapping

    This computer simulation (video below) by Zhaw shows worldwide commercial flights over a 24-hour period. It's been making the blog rounds lately. Watch as flights start in the morning in the western hemisphere, and as the sun starts to come up in the east, more flights begin in the east. I'm not sure if we're seeing actual GPS traces or just interpolated flight paths from point-to-point data, but my guess is the latter. Does anyone understand the language on Zhaw?
    Continue Reading

  • May the Tallest and Fattest Win the Presidency

    October 7, 2008  |  Infographics

    OPEN N.Y. put together an amusing (and informative) graphic for a New York Times op-chart. It shows the height and weight of presidential candidates dating back to 1896, when William McKinley, standing 5 feet 7 inches, won the election to become the 25th president of the United States. The tall lead 17-8 and the heavier lead 18-8. William J. Bryan didn't stand a chance. Will Barack Obama add to the big and tall's lead or will John McCain win one for the little guy?

    [Thanks, Tom]

  • Best of FlowingData: September 2008

    October 6, 2008  |  Best of FlowingData

    September was another good month for FlowingData. We surpassed 5,000 subscribers for the first time - 5,139 to be more precise - and saw more visitors than in any previous month. That's not that much by Internet standards, but by statistician standards, that's usually enough for the Law of Large Numbers to kick in.

    Thank you everyone who continues to spread the word about FlowingData. The blog wouldn't be the same without you.

    In case you missed them, here are the top posts from September.

    1. Winner of the Personal Visualization Project is...
    2. 23 Personal Tools to Learn More About Yourself
    3. Interactive Graph Visualization System - Skyrails
    4. OneGeology Wants to Be Geological Equivalent of Google Maps
    5. See the World Through SimCity's Eyes - One Up On OnionMap
    6. Pie I Have Eaten and Pie I Have Not Eaten
    7. Compare Media Coverage of Presidential Candidates with Everymoment Now
    8. How Consumers Around the World Spend Their Money
    9. Winners of NSF Visualization Challenge 2008 Announced
    10. Beautiful Generative Computer Art - Metamorphosis

  • Highlights from Wired NextFest in Chicago

    October 6, 2008  |  Miscellaneous

    I was in Chicago last week for Wired NextFest – it was impressive, beautiful, engaging, and imaginative. While I had fun presenting some of my own work, it was even more entertaining looking at (and trying out) the other exhibits. Here are some of the highlights of the event.
    Continue Reading

  • Thank You to FlowingData Sponsors

    October 4, 2008  |  Sponsors

    It's been something like a year and a half now since I started FlowingData. It has grown quite a bit since I was talking only to myself. However, with that growth has come greater (financial) responsibilities while I have remained a poor graduate student. Fortunately, I have these two great sponsors to thank for helping this little blog of mine keep running as well as giving me the chance to give back to all you readers.

    Check these groups out. They are doing amazing things with data.

    Eye-Sys - They make scientific visualization doable and emphasize data exploration. Take a look in case studies for the recent Digg example.

    Tableau Software - It's about statistical visualization for Tableau. Analytics is the name and useful visualization is the game.

  • Sketching Around Personal Brand Tracking

    October 3, 2008  |  Design

    This is a guest post by Miguel Jiménez, a user experience and interaction designer based in Madrid.

    There's a lot of noise today around Personal Branding and constructing your own self as a global brand on a certain topic. It makes complete sense to increase your professional value reflecting on others and using the Internet to build up this reputation. It's said that you should start by creating an online identity, supposedly to reflect your Real World™ one, with an entry point in the form of a blog or similar. That's a nice introduction and it’s quite easy to implement, but the main problem in the process of constructing a self-brand is monitoring and tracking how your efforts perform and deciding the next steps you should take. So let's have a conceptual look and sketch around the statistical data found nowadays on the Internet.
    Continue Reading

  • Maps for Advocacy – Beginner’s Guide to Mapping

    October 2, 2008  |  Mapping

    In a follow up to Visualizing Information for Advocacy, the Tactical Technology Collective recently announced Maps for Advocacy: An Introduction to Geographical Mapping Techniques.

    The booklet is an effective guide to using maps in advocacy. The mapping process for advocacy is explained vividly through case studies, descriptions of procedures and methods, a review of data sources as well as a glossary of mapping terminology. Scattered through the booklet are links to websites which afford a glance at a few prolific mapping efforts.

    While the example maps look very Googley and won't impress too many in the online mapping world, there are still some good links in there for data resources, terminology, and how maps play a role in displaying information.

  • We Don’t Know Jack About the World – Alisa Miller TED Talk

    October 1, 2008  |  Mapping

    Alisa Miller, President and CEO of Public Radio International, enlightens us on how little U.S. news coverage there is of the rest of the world. How does she do this? She uses maps, of course. Miller uses visualization to tell a (short) story. She shows us all the coverage of Iraq next to the coverage of all other countries, which is practically nothing.

    The name of this type of morphed map escapes me right now. Maybe someone can remind me?

    [Thanks, Jodi]

  • 3 Applications that Tap Into the Wisdom of Crowds

    September 30, 2008  |  Social Data Analysis

    James Surowiecki writes in The Wisdom of Crowds that the group is smarter than the individual (under four conditions). Essentially, the premise is that if you get enough different people to work on a single problem independently, you're going to get results as good as or better than those of a small group of experts working together. Think of it as advanced crowdsourcing.

    These three applications tap into the wisdom of crowds. It's clearly election season.
    Continue Reading

  • If You Could Track Anything, What Would You Track?

    September 29, 2008  |  Discussion, Self-surveillance

    It's about time we had a FlowingData open thread. We've seen that there are plenty of tools to monitor different aspects of our lives, but I'm wondering if they are tools people actually want or if they are tools that are just easy to make. So my question to all of you is:

    If you could track/monitor anything in your life, what would you track?

    Disregard whether or not the technology is there or any of those gross technical details. Assume anything is possible.

    I'll get things started. I want to know how I spend every minute of my life. Not just on the computer. I want to know how much time I spend watching TV, going out, exercising, walking, sitting, driving, waiting, and eating. Everything.

  • Caption Contest Winner is…

    September 26, 2008  |  Contests

    While we're on the subject of contests, let's not forget the epic battle for best caption. Thank you to everyone who participated. All the entries were great and really entertaining, but unfortunately, there could only be one winner. The winner of Stephen Baker's The Numerati is – Mike for his caption (above), "Severity of Crash vs. Length of Ramp." Congratulations, Mike! Expect an email from me soon. (Ricardo, if it's any consolation, my wife liked yours the best :).

    I put a little something together for everyone else. For everyone who entered – this is for you. I hope you all like it. The darker ones are the honorable mentions.


    Please do let me know if I mistyped or accidentally left anyone out. Thanks again, everyone, for participating. I hope you were all as entertained as I was.

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.