We look up at the starry sky and we sense a fear of not comprehending and being engulfed, a fear of the unknown, and simultaneously we experience a longing for the inaccessible, impenetrable darkness.— Lisa Jevbratt. The Prospect of the Sublime in Data Visualization. 2004.
Speaking of Walmart, if we took all of the Walmarts in the world and clumped them all together, they’d cover Manhattan (with some stores sinking in the water). Walmart is the bottom bubbles; McDonald’s is represented by the second from the bottom set.
I’m slightly surprised that McDonald’s doesn’t cover more. Although, I guess Walmart stores are pretty big compared to McDonald’s restaurants. I’m not really surprised that Walmart area is greater than Manhattan area though. In fact, I thought it would have been more with all the Walmarts in the world. Hmmm…
Into the Artistic Section
As for this graphic, well, if it were supposed to be statistical, I’d say I didn’t like it. It’s not meant to be statistical though. The goal is to show that Walmart is humungo. I get the graphic’s main point, which is… the point. To that end, I couldn’t care less about proper scales, utility, and what not. Take it for what it is and enjoy.
This Walmart graphic goes in the artistic section of viz, as opposed to the pragmatic side (as Robert explains). There are three other graphics similar in feel to this one that cover sugar consumption, student debt, and solar power.
Technology Innovations in Statistics Education (TISE) is a new e-journal that was just announced yesterday. The use of technology (e.g. data visualization) has become extremely important in teaching statistical concepts to newbies, and so this new journal will be really useful; computers have allowed students to explore and experiment in ways students couldn’t do with just paper and pencil. TISE explores these alternatives.
Technology Innovations in Statistics Education (TISE) publishes scholarship on the intersection between technology and statistics education. The current issue includes papers by George Cobb (who challenges the introductory statistics curriculum to radically innovate to adapt to new technology), Beth Chance et al. (who provide an overview of the use of technology to improve student learning), William Finzer et al. (who describe software innovations for improving student access to data), Dani Ben-Zvi (preliminary research results on using Wiki in statistics teaching), Daniel Kaplan (on the role of computation in introductory statistics), and Andee Rubin (a historical overview of technology in statistics education).
These papers can be read at http://tise.stat.ucla.edu. Please click on the “subscribe” button to join the mailing list to be informed of future releases.
TISE is seeking scholarly papers for Volume 2 that address any of these themes:
- Designing technology to improve statistics education
- Using technology to develop conceptual understanding
- Teaching the use of technology to gain insight into and access to data
The first issue is already online. Take a look. I’ve had the opportunity to work with some of the knowledgeable and active members of the editorial board, so TISE looks to be very promising.
Raw, fine-grain data is still a bit hard to come by. Summary statistics (i.e. data that came from some analysis), on the other hand, are often easy to find. A lot of the time the data is already online or just a simple phone call away.
The National Center for Education Statistics, a part of the U.S. Department of Education, offers a bunch of data including, but not at all limited to, poverty and math achievement, average science scores overall and by grade level, and quantitative literacy.
I stumbled across the Social Data Analysis workshop, happening as part of CHI 2008. It is being organized by none other than IBM Visual Communication Lab’s Martin Wattenberg and Fernanda Viégas in addition to UC Berkeley’s Jeffrey Heer and Maneesh Agrawala.
The goals of this workshop are to:
- Bring together, for the first time, the social data analysis community
- Examine the design of social data analysis sites today
- Discuss the role that visualizations play in social data analysis
- Explore how users are utilizing the various sites that allow them to exchange data-based insights
We seek researchers and practitioners whose work explores social data analysis and/or social uses of visualizations. We hope for a lively mix of people actively involved in building sites and academics who study the dynamics of social software.
The workshop happens during CHI, April 5-10, and you need to submit a 2-4 page position paper by October 31, 2007. Oh and by the way, it’s in Florence, Italy. Not too shabby.
It almost feels like I see a new poll every day for who’s leading in the presidential race. There’s usually a good amount of fluctuation within a single poll with sampling margin of error and then of course the numbers vary across multiple polls. This can be confusing at times, so Pollster put all the results in one scatter plot. Then they stuck a smoother through all the points (for each candidate), and just like that, the viewer gets a general sense of how each candidate has been doing.
Keep in mind that the amount of noise (or bumps in the curve) is going to vary depending on the type of estimation you use, so I wouldn’t place the smaller curves under too much scrutiny. I’m not sure what method Pollster is using, but it’s interesting to see the overall trends. Could we be looking at a double New Yorker election?
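To make the point about estimation concrete, here is a minimal sketch of one kind of smoother: a Gaussian-kernel weighted average. I don’t know what Pollster actually uses, so treat this as an illustration only. The poll numbers are made up, and `kernel_smooth`, `roughness`, and the `bandwidth` values are my own names and choices.

```python
import math

def kernel_smooth(xs, ys, bandwidth):
    """Gaussian-kernel weighted average of ys, evaluated at each x in xs."""
    smoothed = []
    for x0 in xs:
        # Points close to x0 get large weights; distant points get tiny ones.
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return smoothed

def roughness(values):
    """Sum of absolute second differences: larger means a bumpier curve."""
    return sum(
        abs(values[i - 1] - 2 * values[i] + values[i + 1])
        for i in range(1, len(values) - 1)
    )

# Made-up poll results: day of the poll and one candidate's share (%).
# A slow upward trend plus alternating +/- 4 points of "noise".
days = list(range(0, 60, 3))
share = [30 + 0.1 * d + (4 if i % 2 else -4) for i, d in enumerate(days)]

narrow = kernel_smooth(days, share, bandwidth=2)   # follows every wiggle
wide = kernel_smooth(days, share, bandwidth=15)    # keeps only the trend

print("roughness, narrow bandwidth:", round(roughness(narrow), 2))
print("roughness, wide bandwidth:  ", round(roughness(wide), 2))
```

Same points, two bandwidths, two very different amounts of bumpiness. That’s why the small wiggles in any one curve say as much about the smoother as they do about the candidate.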
Pollster also offers the raw poll data, so in case you want to have some of your own fun, there’s data waiting for you.
[via Mike Love]
Watch Walmart quickly expand like a deadly virus from the movie Outbreak. It’s particularly interesting to see Walmart “infect” an entire small region with multiple new stores opening at the same time in one area. There looks to be somewhere around 30 stores opening per year (rough guess) across the country, so I wonder what the map looks like now. It’s probably all blue except in those deserted Midwest regions. I wonder what the world map looks like.
When I lived in Maine, there wasn’t much to do, so when we were bored on a Friday night, we’d go hang out at Walmart.
That’s kind of sad, but uh, if that’s your thing, well, no, still sad.
When I was in NYC and my wife was in Buffalo, New York, we talked on the phone almost every day, usually around ten in the evening. I was at my friend’s place one night, and at 10:05pm, my wife called.
The first thing she said was, “Where are you?”
I told her I was at my friend’s.
My wife quickly replied, “Ha! I knew it!”
Confused, I asked, “How did you know?”
“Because otherwise, you would have called me at exactly 9:58.”
Am I really that predictable? First it was the Chinese food, and now I had been accused of call time predictability. Of course there was only one way to put this dispute to rest — look at the data.
Dear Many Eyes,
From the moment I stared into your thousands of solid black eyes, I knew we had something special. Since the day we met you’ve shown me the silver lining in my data and pointed out details that I never would have found on my own. You’re never pushy or arrogant about it; you always let me learn for myself. You believe in my natural pattern-finding ability the same way I believe in your big, beautiful exploratory tools.
Many Eyes, I want to tell you something. I just want to, well, let you know why you’re so high up on my bookmark list. You should also know there’s some ways that you can improve, but please don’t take it personally. I just want you to be all that you can be.
World Freedom Atlas is an online geo-visualization tool that shows a number of freedom indicators, so to speak. For example, you can map by a number of indexes such as raw political rights score, civil liberties, political imprisonment, freedom of religion, freedom of speech, or torture. If I’ve counted correctly, the data comes from 42 datasets divided into three categories:
- What It Is
- How To Get It
- What You Get
What It Is covers data such as political rights and civil liberties while How To Get It is data on government structure and education system. I’m not really sure what What You Get is, though. There’s GDP and some economic indexes, so it could be something like quality of life. Maybe someone can explain it better?
The mapping and plots are pretty standard, but what stands out is the number of datasets that have been formatted in such a way that the user is able to map things quickly and easily. It would be really cool if the data were explained a little better, so that I could “browse” the data a bit more efficiently, and even better, if there were some way to compare indicators against each other. Nevertheless, worth exploring a bit.
I just saw Stranger than Fiction. The main character, Harold Crick, spends much of his life counting. He counts the number of steps it takes for him to walk from his home to the bus stop; he brushes his teeth 76 times every morning; he takes a 45.7-minute lunch break and a 4.3-minute coffee break.
So much counting and tracking. Sounds kind of familiar. Maybe a little too familiar? Nah.
71 words. 320 characters. Nine sentences. Wait, now ten. Eleven. Err, twelve…
If you have a blog, I’m sure you’ve heard of the ever so popular ProBlogger blog. To celebrate, Darren is giving away $54,000 worth of prizes! The current giveaway is for two 20-inch USB monitors, and all you have to do is post about the giveaway (hence this post :). They’re going to have a random drawing some time Friday night. If you don’t need the monitor, there’s a whole lot of other cool stuff being given away in the next few days.
At the Times, I got used to using a super sexy Apple high resolution wide screen to create graphics, but back at home I’ve just been using my laptop and a not-so-hot 1280p 19-inch LCD screen. It’s true what they say about productivity and screen real estate — especially with visualization. I sure wouldn’t mind having these two 20-inch monitors.
What the heck’s a data guy? According to Gerard, who studied computer science and economics in college:
It means that I’m the type of person who, instead of planning for a vacation like a normal person, will write a script to pull down airline data for all possible destinations and routes, load the data into R and perform a regression analysis to find the best time to buy.
Oh, so that’s what a data guy is. I guess that makes me a data guy.
This should be good for Swivel, who has seemed to be missing the “data guy” piece of the puzzle. Will Swivel’s visualization tools improve? Will data become more reliable on Swivel? I don’t know. It’s possible. There’s definitely a lot of work to be done, so one person won’t be enough, but hey, it’s a start. It’s not often that I see a computer science / economics person. I’m an electrical engineering and computer science / statistics person myself, and I like to see people with dual backgrounds (even if they did go to the other school across the bay).
That being said, applications like Swivel, Many Eyes, and Data360 make me wonder where all the statisticians are. I see mathematicians, designers, economists, and businessmen. Come on statisticians. Show yourselves. The world needs you.
Anheuser-Busch (Budweiser), Miller, and Coors lead the way in beer. Granted, this is shipment data, not sales data, so take the numbers with a grain of salt.
The extreme dominance of the top three American beers was somewhat surprising to me, because I never see people order any of those three at restaurants. However, I gave it a few more seconds of thought. I’m thinking parties, sporting events, and drunken nights. The American beers go down easier (because they’re like water), so it’s easier to get drunk. To get drunk, people drink more. So I guess the watery dominance isn’t that surprising. I guess when people buy beer for taste at restaurants, they look to different brands.
Anyhow, I’m really starting to become a fan of these bubble charts. They’re really easy to read and can quickly spruce up a hard-to-read table of numbers. They also seem to scale decently. By decently, I don’t mean in like the thousands, but in the tens, I think the bubbles can hold their own.
What kind of beer do you prefer?
UCLA Statistics has a pretty extensive list of resources on how to use R and GRASS. For those unfamiliar, R is a programming language and environment for statistical computing and graphics. GRASS is an open source geographic information system (GIS). And of course, both are completely free and completely useful.
The StatGrad discussion board is now online — a place where stat students can hang out.
One of the things I miss most about going to school is hanging out with my cohort. I work from home in Buffalo, and I get bored and restless pretty easily. When I was at school and feeling restless, I could just go down to the stat lounge, sit on the ridiculous-looking Ikea couch, and relax with some classmates. We never sat around and talked about probability theory or the law of large numbers (ok, maybe we did sometimes), but because we were all stat students, we all had this data-ish way of thinking. Know what I mean?
That’s what I’m hoping for StatGrad. I’m not interested in finding help for specific stat problems or trying to answer R questions. There are plenty of books and online resources for that type of stuff. I’m just hoping that StatGrad can become a place where stat grad students can hang out when they’re bored. Complain about undergrads, discuss anything interesting happening in our field, look for job opportunities, and stay up to date on calls for papers.
Join StatGrad now. I know you want to. Please? Come on, I’m bored.
This Venn diagram showing results from tests for autism really seems to be making its rounds lately. It began with Igor Carron asking on his blog if there was a better way to display the data. Then Andrew Gelman put something of a redesign challenge up on his blog, and after Andrew, the challenge headed on over to Junk Charts. Redesigns are flying off the shelves! From bar, to mosaic, to tornado charts, there are clearly many ways to represent data.
Which one is the best? It’s hard to say, because they all have advantages and disadvantages and the answer really depends on what point you’re trying to drive home.
However, I can find one advantage that the original Venn diagram has over its redesigns: it’s intuitive for many people. John Venn introduced his diagram in 1881, over a century ago. That’s a long time for people to adjust. People understand it. It makes sense. Yes, this particular Venn is really ugly and probably didn’t belong in a PowerPoint presentation, but doesn’t it say something that re-designers were able to read it and use the data it provided? I think so.
So in the spirit of Indexed, here’s to you Mr. Venn.
Some time last month, Many Eyes introduced their text visualization, the word tree. The user starts from a word or phrase, which is the root (or the trunk?) of the tree, and then the branches are the continuations of the sentences in which the word appeared. The advantage of the word tree is that the order of words stays the same, as opposed to a jumbled tag cloud.
Hence, the word tree allows the user to gain a better understanding of text flow and writing patterns than she would with a cloud.
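If you want a feel for the structure behind it, here’s a rough sketch of how a word tree could be built. This is my own toy version, not the Many Eyes implementation: `build_word_tree`, `print_tree`, the `depth` cutoff, and the sample text are all made up for illustration, and sentence splitting is deliberately crude.

```python
import re

def build_word_tree(text, root, depth=3):
    """Nested dict of the words that follow each occurrence of `root`.

    Each node maps a word to {"count": n, "next": {...}}, where count is how
    many times that word appeared at that position after the root.
    """
    root = root.lower()
    tree = {}
    # Crude sentence split so branches never run across sentence boundaries.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z']+", sentence)
        for i, word in enumerate(words):
            if word != root:
                continue
            node = tree
            # Walk down the tree, one following word at a time.
            for nxt in words[i + 1 : i + 1 + depth]:
                child = node.setdefault(nxt, {"count": 0, "next": {}})
                child["count"] += 1
                node = child["next"]
    return tree

def print_tree(tree, indent=0):
    """Print branches, most frequent first, indented by depth."""
    for word, child in sorted(tree.items(), key=lambda kv: -kv[1]["count"]):
        print("  " * indent + f"{word} ({child['count']})")
        print_tree(child["next"], indent + 1)

sample = "I have a dream today. I have a plan. We have the data."
tree = build_word_tree(sample, "have")
print_tree(tree)
```

Because every branch is read left to right in the original order, the counts tell you which continuations are common without scrambling the sentences the way a cloud does.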
I found that it was very easy to create a word tree with some text that I had uploaded, but while starting exploration, I was unsure about what words to begin with. The word tree interface is similar to Martin Wattenberg’s earlier Baby Name Wizard. The user naturally has some ideas on what to start with since it’s an exploration of names. However, with the word tree, it’s not as obvious, because the user might be exploring a body of text she’s unfamiliar with.
So instead I began sifting with a word cloud, which gave me an idea of some important words and phrases used in the text. Then it was simple to move from the word cloud to the word tree. The two viz tools — cloud and tree — go together quite nicely as the cloud kind of works as a suggestion box for the tree. As a standalone, the word tree is off to a good start.
I’ve never really been interested in baseball. I’ve always been more of a basketball and football fan. However, my summer roommate was a die hard baseball fan, and I’m convinced that he brainwashed me into rooting for the New York Mets. Just a couple of weeks ago, someone told me he was a Phillies fan, and I let out a blech of disgust without even thinking about it.
So with the Mets’ most recent loss, I’m a bit disgruntled, and I’m sure my old roommate is pissed as can be. The Mets are no longer leading the Phillies for the number one spot in the NL East.
What better way to see how poorly the Mets are playing than with a graphic? I decided to compare this year’s Mets season with the 1986 Mets World Series winning season, because that should probably be what they’re shooting for. As my roommate would angrily exclaim, “If they can’t get their #%&$ act together, they don’t deserve to go to the playoffs!”
I saw this map of the average snow levels in Buffalo. I think I just glanced at it and that was about it. When you first look at the map, what do you make of the colors? When I see green for snow levels, I think no snow. Am I crazy? What do you think?
So the image was kind of in my head all this summer while I was in NYC. When I told people that I was going back to Buffalo after my internship, they always gave this look that said, “Ha, have fun during the winter,” and then they would actually say it and then go into how they measure the snow level by comparing it against a giant pole.