• Data and Statistics For Human Rights

    Posted to Statistics

    Patrick Ball, a human rights statistician, finds truth in numbers while analyzing and consulting to find patterns and uncover scale in crimes against humanity.

    The tension started in the witness room. "You could feel the stress rolling off the walls in there," Patrick Ball remembers. "I can remember realizing that this is why lawyers wear sport coats – you can't see all the sweat on their arms and back." He was, you could say, a little nervous to be cross-examined by Slobodan Milosevic.

    Mr. Ball was the first expert witness called in the case against the former Serbian president, who was representing himself against mass atrocity charges at the International Criminal Tribunal for Yugoslavia. Ball had spent 10 months crunching numbers about migration patterns in the former Yugoslav province of Kosovo; his findings suggested that hundreds of thousands of refugees who fled to Albania were spurred by the violence of Mr. Milosevic's army. By the time Ball entered the tribunal chamber, in March 2002, the ousted leader had a reputation for grand orations rather than direct questions; when Milosevic veered off track, the judge would interrupt. "Milosevic would say, 'Dobro,' and go on...." Ball remembers. "It means, 'OK, very well,' but it was clearly a, 'Very well, we'll have you shot later.' I hear [that] in my dreams periodically."

    Ball is a statistician – not exactly a profession usually associated with human rights defense. But the Human Rights Data Analysis Group that he heads at Benetech, a technology company with a social justice focus, is bringing the power of quantitative analysis to a field otherwise full of anecdote.

    That's right. Statistics is awesome. I dare you to disagree.

    [via Statistical Modeling]

  • Poverty Statistics that Make Sense – Welcome to Povertyville and Slumtown

    Dan Beech represents worldwide poverty in this video, which is actually a 3-dimensional bar chart with some flair:

    Welcome to Povertyville, Slumtown, and Low Income city. I'm not sure what to think. Should I laugh? Should I cry? I don't know. What do you think?

    In this genre of over-produced graphs, Povertyville reminds me of the real estate roller coaster, a dramatic 3-D time series plot:

  • Write a Guest Post for FlowingData

    Posted to Site News

    Early next month, I'm going to be traveling a bit. I'm headed back to California for about a week for some work-related stuff. Soon after, my wife and I will be celebrating our one-year anniversary on some tropical island where I will be basking in the glory of all-inclusive. The following week, I'll be at the International Summit for Community Wireless Networks.

    I'm going to write posts in advance, but I'd also like to feature some high quality posts from FlowingData readers (like yourself) while I'm gone.

    What I'm Looking For

    I'm pretty open as long as it's within the scope of FlowingData, but here are some ideas I'm interested in finding:

    • Anecdotes on how you use data, statistics, or visualization to discover new things.
    • The design process (from data-culling to final product) from those who are working on or who have worked on data visualization projects.
    • Tips and tutorials on how to tackle certain types of data.

    I'm not looking for heavy promotion of a product (although I don't mind if you mention it). I want to keep the focus on learning and not so much on buying. Also, I'm looking for original content only. I say this just because I want to stay legit with search engines, so please, no duplicate content.

    Email Me Your Post

    To submit a post, send it to me via email. Put "FlowingData Guest Post" in the subject line, and put your post in the actual email or a plain text file. No Microsoft Word documents, and if your post is already with HTML markup, all the better.

    I'm not really sure how many posts to expect, but I'll use as many submissions as possible, if not all of them. My hope is that I'll be able to highlight some more flowing data as well as help us all learn a thing or two. Looking forward to what you all have in store.

  • Rolling Out Your Own Online Maps and Graphs with HTML/CSS

    Wilson Miner and Paul Smith, two co-founders of Everyblock, post tutorials and a little bit of their own experiences rolling out their own maps and creating graphs with web standards.

    Why Not Go With Google Maps?

    Paul gets into the mechanics of how you can use your own maps discussing the map stack - browser UI, tile cache, map server, and finally, the data. My favorite part though was his reasons for going with their own maps:

    Ask yourself this question: why would you, as a website developer who controls all aspects of your site, from typography to layout, to color palette to photography, to UI functionality, allow a big, alien blob to be plopped down in the middle of your otherwise meticulously designed application? Think about it. You accept whatever colors, fonts, and map layers Google chooses for their map tiles. Sure, you try to rein it back in with custom markers and overlays, but at the root, the core component—the map itself—is out of your hands.

    Because it's so easy to put in Google Maps instead of making your own (although it is getting a little easier), everything starts to look and feel the same and we get stuck in this Google Maps-confined interaction funk. Don't get me wrong. Google Maps does have its uses and it is a great application. I look up directions with it all the time, but we should also keep in mind that there's more to mapping than bubble markers all in the color of the Google flag.

    Remember: a little bit of design goes a long way.

    Data Visualization with Web Standards

    Wilson provides a tutorial for horizontal bar charts and sparklines with nothing but HTML and CSS. Why would you want to do this when you could use some fancy graphing API? Using Everyblock as an example, data visualization can serve as part of a navigation system as opposed to a standalone graphic:

    Everyblock Graphs

    Sometimes the visualization isn't at the center of attention.

    Make sure you check out Everyblock, a site that is all about the data in your very own neighborhood, to see these maps and graphs in action.
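
    For a rough idea of what an HTML/CSS-only horizontal bar chart involves, here is a minimal sketch. It is not Wilson's actual markup: the helper name, the inline styles, and the sample counts are all made up for illustration. It is a small Python function that emits a plain HTML list in which each bar is a span whose CSS width is a percentage of the largest value.

```python
from html import escape


def bar_chart_html(data):
    """Build an HTML list of bars from a non-empty list of (label, value) pairs.

    Each bar's width is its value as a percentage of the largest value.
    """
    largest = max(value for _, value in data)
    rows = []
    for label, value in data:
        width = round(100 * value / largest) if largest else 0
        rows.append(
            f"<li>{escape(label)} "
            f'<span style="display:inline-block;background:#369;'
            f'height:1em;width:{width}%"></span> {value}</li>'
        )
    return '<ul style="list-style:none;padding:0">\n' + "\n".join(rows) + "\n</ul>"


# Hypothetical counts, for illustration only.
sample = [("Crime", 40), ("Restaurant inspections", 20), ("Building permits", 10)]
print(bar_chart_html(sample))
```

    Because the output is ordinary list markup, it can sit inside navigation or a sidebar like any other HTML, which is the point of doing it without a graphing API.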

    [Thanks, Jodi]

  • Showing the Obama-Clinton Divide in Decision Tree Infographic

    Posted to Infographics

    Amanda Cox, of The New York Times, made another excellent graphic (and I wouldn't expect anything less). We see an entire story between Obama and Clinton - positions taken, counties won, and counties lost. Go ahead and take a look. Words bad. Picture good. Ooga. Booga.

    [via Infographics News]

  • Hierarchical Glossary as Interactive Network Graphs

    Posted to Visualization

    Moritz has been working on a visualization of a hierarchical glossary, carefully named "Glossary Visualization" versions 2-5. Not sure where version 1 is. Since it's a network graph, I can see this getting chaotic when there are more words (or categories) involved, but then again, maybe that's all the words. In either case, it beats browsing through words in a dictionary, although these prototypes don't include definitions yet.

    In the most recent version, words are represented as a DOI tree showing only the categories. Click on a category and view the sub-categories.

    glossary visualization

    All four versions were implemented using the recently-mentioned Flare visualization toolkit.

    What do you think - cluttered or just right?

  • How to Learn Actionscript (Flash) for Data Visualization

    Posted to Software

    A while back, I asked, "What is the best way to learn Actionscript for data visualization?" As I've had Actionscript staring me in the face for the past two weeks, I can attest to the idea that the best way to learn is by doing, i.e., immersing yourself in a project with a deadline looming in the dark behind you. There have been, however, a few things that have made my life a little easier as I strive for coding nirvana.

    My Only Desktop Reference

    I have stacks of books on the floor, in the closet, and on my bookshelf, but there's one book that has stayed within arm's reach as I learn - Colin Moock's Essential Actionscript 3.0. This is usually the first place I go to look when I'm stuck on a bug or am not sure where to begin. Moock's explanations are very clear and he provides plenty of useful examples without getting too specific.

    When I first started, I read the first section "Actionscript from the Ground Up," which helped me familiarize myself with core concepts like packages, classes, and just the basic ideas of how things work. I feel like one of the hardest parts of learning any programming language is figuring out how all the components talk to each other, so this first section helped a lot. I skimmed the rest of the book, and now it's my only desktop reference.

    I'm also starting to hear great things about Learning ActionScript 3.0: A Beginner's Guide by Shupe and Rosser, but I haven't got to look at it yet.

    Flare Visualization Toolkit

    Jeffrey Heer's Flare visualization toolkit seems to have come out at just the right time, specifically for me. Seriously, the timing couldn't have been better. For instant gratification, go through the tutorial, which covers a few Actionscript basics and straightforward examples, mainly for reading in data and animating and transitioning objects.

    After the tutorial, try to build some of your own visualizations and apply what you learned. Finally, when you're more comfortable, dive into the Flare code to see how things work.

    Modest Maps for Flexible Mapping


    For those interested in mapping, Modest Maps has helped me a good bit. From the site:

    Our intent is to provide a minimal, extensible, customizable, and free display library for discriminating designers and developers who want to use interactive maps in their own projects. Modest Maps provides a core set of features in a tight, clean package, with plenty of hooks for additional functionality.

    They're not lying. It provides the basic map functionality like pan and zoom, but it's open, so you can do whatever you want from there. I've been using Flare and Modest Maps together to take the best of both worlds, I guess you could say. There's also the Yahoo! Maps Actionscript API, but I haven't tried it. I don't know if it's as flexible as Modest, but I like the idea of owning all of my code.

    Adobe Flex Builder for Actionscript Development

    Flex Builder has been extremely helpful while coding. The name might suggest it's only for Flex projects, but it's pretty darn good for Actionscript projects. The serious Actionscript people I've talked to only seem to use Flex. The other option is to use your text editor of choice and install the free Flex SDK, but it's more complicated (and I've never tried it).

    The downside of Flex is that it's kind of expensive, priced at just under $250 and even more for the pro version. On the flip side, Flex Builder Pro 3 is free to all education customers.

    Last Thoughts

    Finally, let's not forget about Adobe's Actionscript 3.0 language and components reference. In addition to Moock's book, this is the other indispensable resource. And of course there are all the online resources you'll find à la Google.

    This is pretty much what I've been immersed in for the past two weeks. It's definitely a steep learning curve, but once I got the hang of things, it's been pretty fun and nice to see my data moving along.

    Anyways, I'm just now starting to kick the tires. I am sure there are many of you who have been at this for a while and who know a ton more than I do. What references or resources do you recommend for Flash/Actionscript beginners like myself?

  • Facebook Lexicon – Trends for Writings on the Wall

    Facebook recently released Lexicon, which is like a Google Trends or Technorati for wall posts. Type in a word or a group of words, and you can see the buzz for those terms in a time series plot. Daniel sent me this excellent example. Type in party tonight, hangover and you'll get the above graph. Notice the Saturday spikes for party tonight and the Sunday spikes for hangover? Here's another one for finals:

    Facebook Lexicon

    It's interesting to see what people are talking about, and being Facebook walls, there's this realness to the charts (or maybe that's just me).

    Go ahead. Give Lexicon a try. What interesting queries can you find?

    P.S. You have to be logged in to use it.

    [Thanks, Daniel]

  • 3 Rules of Thumb When Designing Visualization

    Posted to Visualization

    Bernard Kerr, the lead designer for del.icio.us, gave an interesting talk (below) focused on remail (mentioned here) and tagorbitals. At the end, he offers three important lessons.

    Reduce Multidimensional Data

    After showing many thread arc versions, Kerr says that when you are dealing with multidimensional data, pick two variables; otherwise, you're going to end up with a big mess. He says this literally, but don't forget that you can also reduce dimensionality with super special and magical statistical methods.
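
    As one example of those statistical methods, here is a minimal sketch of principal component analysis, which collapses several correlated variables into a single score per record. This is an illustration rather than anything Kerr showed: the function name is made up, and it uses plain power iteration in standard-library Python, which finds only the first component and can fail in the rare case where the starting vector is exactly orthogonal to it.

```python
import math


def first_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows is a list of equal-length numeric tuples with at least two records.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Asymmetric starting vector; power iteration converges toward the
    # eigenvector with the largest eigenvalue.
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    # One score per record: its position along the dominant direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical records with two perfectly correlated variables.
direction, scores = first_component([(1, 2), (2, 4), (3, 6)])
print(direction, scores)
```

    A second axis for a two-variable plot could be extracted by subtracting the first component from the data and repeating, though a statistics library would normally handle that.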

    Use Real Data

    You won't know what you're really dealing with until you have the real data. You can spend lots of time guessing what the data are going to be, but it's the real data that will eventually drive your design. This goes for statistics too. Real data leads to real analysis.

    Try Adobe Illustrator

    Adobe Illustrator offers a JavaScript interface, so try that out before opening Processing or Flex Builder and programming through the midnight hours. Illustrator is of course also good for static mockups and brainstorming. My workflow usually starts with paper and pencil, then moves to Illustrator, and then to the programming. Some people go straight to code, but that's never worked well for me.

    What rules of thumb do you follow?

    Here's the talk in full. It's pretty interesting, if you've got about 25 minutes to spare.

    [via infosthetics]

  • Atheist Statistics For 2008 – Do You Believe These?

    Posted to Mistaken Data

    This video shows statistics centered around atheism, claiming that atheism is correlated with a healthy society. I don't want to turn this into a religious debate, but I really don't like these types of videos, slide shows, etc. It's not the ideas that bother me; it's that some people think it's a great idea to rattle off a bunch of numbers to "prove" a point. Never mind the biases, invalid studies, poor analysis, cruddy data, and "results" taken out of context.

    What do you think? Do you buy this stuff?

  • Data Visualization Blogs You Might Not Know About

    Posted to Visualization

    We all know about information aesthetics, but what other visualization blogs are out there? While writing for FlowingData, I've come across some good ones, either because people sent me links (hint) or because I randomly found them. Here are some of the visualization (and mapping) blogs that I enjoy.

    • Strange Maps - Lots of unique maps from ads, books, papers, etc., with very informed commentary.
    • Well-formed Data - Moritz is interested in interface design, visualization, statistics and data mining and is a freelance visualizer.
    • Random Etc. - Tom occasionally updates his blog with thoughts, resources, and, well, random etc.
    • Serial Consign - Greg talks about design and research with some visualization mixed in.
    • AnyGeo - Covers everything geospatial, although I do wish Glenn would switch to full feeds.

    What are some of your favorites that others might not know about?

  • How to Stop Procrastinating – One Month Report

    Posted to Self-surveillance

    About a month ago, I started my self-experiment to stop procrastinating. I tried these two strategies:

    1. Make a to-do list every night to lay out what will get done the next day
    2. Enable the Greasemonkey script - Invisibility Cloak - which will block all the sites that I waste too much time on except during lunch and on the weekend
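
    The blocking rule in the second strategy is simple enough to sketch. This is not the Invisibility Cloak script itself (which is a Greasemonkey userscript); it is a small Python illustration of the same rule, with a hypothetical site list and an assumed lunch hour of noon to 1 p.m.

```python
from datetime import datetime, time

BLOCKED_HOSTS = {"example-timesink.com"}  # hypothetical list of time-wasting sites
LUNCH_START, LUNCH_END = time(12, 0), time(13, 0)  # assumed lunch hour


def is_blocked(host, now):
    """Return True if host should be hidden at the given datetime."""
    if host not in BLOCKED_HOSTS:
        return False
    if now.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return False
    if LUNCH_START <= now.time() < LUNCH_END:
        return False
    return True


# A weekday morning visit to a listed site gets blocked.
print(is_blocked("example-timesink.com", datetime(2008, 5, 5, 10, 0)))
```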

    By mid-month, my browsing time was down only a dismal 3.5%. Here's my one month report.

  • Reflecting on Life After Statistics – R.I.P. Minghui Yu

    Posted to Statistics

    Rachel, one of the organizers of Columbia's Life After Statistics, reflects on lessons learned from the conference and pays her respects to a fellow statistician who was lost that night.

    As one of the organizers of the event, Life After a Statistics Doctoral Program (a conference organized by the doctoral students in Columbia's Statistics Department), I was excited to be invited to guest post on Nathan's blog but then realized that my perception of the event would be so different than that of an attendee that perhaps I shouldn't. Two post-docs from Columbia's Statistics department, Matt and Kenny, agreed that they would post and they did -- once on Andrew Gelman's blog and once on Nathan's.

  • H. G. Wells on Quantitative Thinking

    Posted to Quotes

    The time may not be very remote when it will be understood that for complete initiation as an efficient citizen of one of the new great complex world wide states that are now developing, it is as necessary to be able to compute, to think in averages and maxima and minima, as it is now to be able to read and write.

    H.G. Wells, Mankind in the Making, 1904

    [Thanks, Jan]

  • Mapping America’s Most Sinful Cities

    Posted to Mapping

    Forbes, with the help of Mavin Digital, ranked and mapped cities based on the seven deadly sins - lust, gluttony, avarice, sloth, wrath, envy, and pride.

    For each sin we stretched our imagination to find a workable proxy--murder rates for wrath, per capita billionaires for avarice--then culled the available data sources to rank the cities. Some of the results were surprising: Salt Lake City as America's Vainest City. Some were not: Detroit as America's Most Murderous.

    It's always good to remember to take these with a grain of salt, since you don't really know much about the metrics used and how useful these metrics really are. Usually, rankings like these involve a lot of assumptions about the data.

    They are of course still interesting and fun to look at though. Apparently, I moved from one of America's most gluttonous cities to one of the most violent and lustful.

    Gluttony

    Lust

  • What Can You Do With a Degree In Statistics? – A Follow Up

    Posted to Statistics

    This past Friday, Columbia University stat graduate students hosted a symposium on careers for students in statistics. Kenneth Shirley, a stat postdoc, was nice enough to write this guest post about the conference so that we can all learn from it. There were two panels - academic and industry - including representation from Google, AT&T, and Pfizer.

    Yesterday's conference at Columbia about career opportunities for Statistics Ph.D. graduates was a great success. It was organized by the graduate students in Columbia’s Stats department and advertised on the web here:

    http://www.stat.columbia.edu/career_conf08/

    Andrew Gelman made some opening remarks, and then there were two panel discussions, each with five professional statisticians. The first panel consisted of academic statisticians, and the second panel consisted of industry statisticians. Here are some comments I found interesting.

  • Personal Transactions as a Network Graph Over Time

    Posted to Data Art

    Transactions Graph, by Burak Arikan, is a piece placing personal transactions in a network graph. Each node represents a transaction, while connections (or edges) show a relationship between transactions based on time and spending category. The thicker the edge, the greater the total of the two connected transactions. Viewers are also able to scroll through time to watch how transactions evolve.
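
    To make that structure concrete, here is a minimal sketch of how such a graph could be assembled. It is not Arikan's implementation: the exact linking rule isn't described here, so this assumes two transactions are connected when they share a category and fall within a week of each other. The field names and sample records are hypothetical.

```python
from datetime import date
from itertools import combinations


def build_edges(transactions, max_days=7):
    """Return (id_a, id_b, weight) edges between related transactions.

    Assumed rule: same category and at most max_days apart. The weight is
    the combined amount, which would drive edge thickness.
    """
    edges = []
    for a, b in combinations(transactions, 2):
        same_category = a["category"] == b["category"]
        close_in_time = abs((a["day"] - b["day"]).days) <= max_days
        if same_category and close_in_time:
            edges.append((a["id"], b["id"], a["amount"] + b["amount"]))
    return edges


# Hypothetical transactions, for illustration only.
transactions = [
    {"id": 1, "day": date(2008, 5, 1), "category": "food", "amount": 10},
    {"id": 2, "day": date(2008, 5, 3), "category": "food", "amount": 5},
    {"id": 3, "day": date(2008, 5, 2), "category": "rent", "amount": 800},
    {"id": 4, "day": date(2008, 6, 1), "category": "food", "amount": 7},
]
print(build_edges(transactions))
```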

  • Regularities and Patterns Within a Literary Space

    Posted to Data Art

    Stefanie Posavec maps literary works at the Sheffield Galleries On the Map exhibit. There are several parts to Stefanie's piece, mapping sentence length, writing style, and structure. From the looks of things, the parsing process was manual and involved a lot of highlighting and circling. I could be wrong though. For some reason, long manual labor makes me appreciate things more.

  • Chernoff Faces to Display Baseball Managers From 2007 MLB Season

    Check out this lovely use of Chernoff Faces by Steve Wang of Swarthmore College. This method of visualization was developed by none other than mathematician-statistician-physicist Herman Chernoff in 1973. These faces were designed on the premise that people could easily understand facial expressions. With that in mind, Chernoff used facial characteristics to represent multivariate data.

    If you like, you can make your own Chernoff faces with this R library.
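
    The core of the technique is just a mapping from variables to facial features. Here is a minimal sketch of that step, not the R library's API: each variable is scaled to the range 0 to 1 and assigned to a feature, and the feature names and sample numbers are made up for illustration. Drawing the faces from these parameters is left out.

```python
def face_parameters(records, features=("face_width", "eye_size", "mouth_curve")):
    """Min-max scale each variable and assign it to a facial feature.

    records is a list of equal-length numeric tuples, one per face.
    A variable with no spread maps to the midpoint, 0.5.
    """
    columns = list(zip(*records))
    faces = []
    for row in records:
        face = {}
        for name, value, column in zip(features, row, columns):
            low, high = min(column), max(column)
            face[name] = (value - low) / (high - low) if high > low else 0.5
        faces.append(face)
    return faces


# Hypothetical three-variable records, one per face.
for face in face_parameters([(10, 0.2, 5), (20, 0.4, 5)]):
    print(face)
```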

  • 21 (Eco)Visualizations for Energy Consumption Awareness

    Posted to Visualization

    Energy consumption is a growing concern, and with the popularity of Mr. Gore's An Inconvenient Truth, just about everyone is, at the very least, semi-aware of it. These 21 visualizations and designs were created to increase that awareness, so that maybe a few more people will turn off the light when they leave a room. I think Peter Crabb said it best (in a quote I borrowed from Tiffany Holmes' ecoviz paper):

    [P]eople do not use energy; they use devices and products. How devices and products are designed determines how we use them, which in turn determines rates of energy depletion.

    Here they are - 21 dashboards, ambient devices, games, and calculators.