• Admittedly, ever since the Spring quarter ended, I’ve either been preparing for my internship at The Times or have been occupied by the internship. I haven’t given much thought to my dissertation topic, which in the vaguest of terms will somehow encompass three things:

    • Social Data Visualization
    • Eco-Visualization
    • Visualization of my Life

    I have yet to figure out how to tie the three together in a worthwhile way or even whether I will include all three. Wrapped around the three will be data sharing. I got to thinking a little bit about visualizing my life in data today.

    My adviser forwarded me this info design piece, by Gregory Dizzia (which was apparently also featured on infosthetics):

    Greg’s Relationships

    First off, this is a cool piece. If you haven’t seen it, go to the site and download the pdf. It’s a simple idea. Document past relationships — how they began, how they ended, what happened in between. The information is organized very well. At a glance, you can see how many relationships Greg has had in his life and all the one night stands he had after his mid-life, long-term relationship. The design is attractive and I could relate to the information, so I was drawn in to look more.

    Dig a little deeper, and you’ll see that there’s not just one engagement ring during that long-term relationship with Sarah. There’s a second one during his time with his very first girlfriend, Megan. Although, I’m a little wary of calling Megan a girlfriend since it was during Greg’s tender years at age 9 to 11. Stuff like that makes me want to know more.

    Was he really engaged? Was it an arranged marriage or something? What do those breakup symbols really mean?

    Life Visualization Appeal

    Right off, Greg’s piece drew me in, because (1) it was pretty, (2) I could relate to the data, and (3) there was a very human factor. This could probably be generalized to all types of successful visualization, but (2) and (3) are, I think, synonymous with life viz. That’s two out of three things that are automatic. Plus, as the visualizer I have a very strong emotional attachment to the data.

    NOW, what happens when we have 100 people’s relationships to visualize? 1000? That’s when it gets really interesting and social data visualization makes its way into the picture. Well, something to think about.

  • Tired of looking at my New York Times graphics yet? Too bad. Here’s another one for your viewing pleasure.

    CUNY SAT Math Graph

    CUNY schools are planning to raise their SAT math cutoffs to 510 for their top-tier schools and to 500 for the rest. Believe it or not, the current cutoff for all schools is 480. Some say the increase in standards is good for the school’s reputation. Others argue that the new cutoffs single out a lot of minorities since the high school education system is uneven.

    Currently, lots of students are coming into CUNY schools unprepared to take college-level math courses, and the college ends up teaching remedial courses like pre-algebra. That’s just SAD. It’s probably more important to focus on improving the high school education system than it is to try to get unqualified students into college.

  • I had a chance to browse through some of my subscribed feeds today, and I saw a post called Noisy Subways by Kaiser over at the Junk Charts blog. So I clicked, since it isn’t one of those full feeds, and then I saw The New York subway report card. I smiled, because, well, I made that chart just a few days ago!

    Just a disclaimer: The Times chart was just The New York Times version of the original Straphangers report:

    Straphanger Subway Report Card

    Anyways, there was a bit of a discussion, which again, I found very amusing. I felt kind of special in a way.

    There were two main points to the post – 1. Noisy data; and 2. Chart is hard to read. I’m very tired right now, so I’ll just say a few things.

    Yes, the data is really noisy, but why shouldn’t it be? We shouldn’t assume that all six variables are positively correlated. It’s very possible for a line to be very reliable, but have no seats. One could argue that the lines with more people HAVE to be more reliable, because if something goes wrong, more people are going to get screwed.

    Secondly – sure, the chart is a bit hard to read at a glance, but who’s the audience? New Yorkers are the audience, and the first thing that they’re going to do is look for their subway line. That’s what I did. With the audience in mind, I think the chart serves its purpose.

    Most of the commenters provided decent ideas for alternative graphics. My opinion is that with this kind of data, it’s up for grabs. Audience is key though for charts, graphs, plots, maps, etc. in a newspaper. Spiders and whiskers won’t make sense to many people. You’d be amazed at how many people don’t know how to read a scatter plot. The public is getting better though. They’ll get there.

    As for the person who left the comment about the gaps in the chart, I’m going to assume that was in haste. Some lines are tied, hence some blank spaces.

    Welp, that was fun. Yawwwwn. Time for bed.

  • My second graphic was in The Times Metro section today (Tuesday, July 24, pg B2). It’s an annual report card compiled by the Straphangers Campaign for every New York subway line. The No. 1 line was coincidentally ranked best while the C and the W (one of the lines I take) were near the bottom.

  • Google Reader Trends

    This is just really amusing to me. Above is a bar plot, from Google Reader, of the number of items I’ve read in the past 30 days, with each bar representing a day. Quite easy to see when I had a little bit too much time on my hands. Right when the internship starts, the number of items read plummets. I miss my subscribed feeds =(.

  • It’s six days in, and I’m starting to get used to Adobe Illustrator. It’s one honker of a program, so I’m picking up things as I go along, but on the upside, I’m really glad I went through some Illustrator lessons to at least familiarize myself with layers, etc.

    I think I’m getting closer to the point where it’s less about “How do I do this?” and more about “What am I going to show?” Don’t get me wrong. There’s A LOT I still don’t know how to do, but at least I know enough to figure out a good amount on my own. Just a lot to figure out about The Times graphics style — font, sizes, color, etc.

    The administrative stuff is the hardest part of all though. While I’m working on a graphic, I have to keep all the necessary people updated, i.e. the reporter whose story I am making the graphic for. I got scolded today, because a reporter didn’t know that her story was put on hold. I didn’t know that I was her only contact link. Lesson learned. I’m just going to contact everyone from now on. Better to provide too much information than too little (in this case, at least).

    Once a graphic is completed, I have to print out five copies or so and hand them out to all of the necessary people. Next, update the graphic schedule, and then place it in the active list. It’s strange that even though we’re all equipped with these super awesome computers, I still have to walk upstairs and hand-deliver copies of a graphic. I guess nothing can replace human contact.

  • Net purchase of U.S. bonds and stocks by foreign investors

    And here it is. Floyd Norris has a weekly editorial called Off the Charts. This week’s was titled A Blockbuster Seller Overseas: Stakes in Corporate America. It’s about the increase in the amount of money foreign investors are putting into American businesses. Check out the Business section in The New York Times! There’s also an online version.

    It was ridiculous how many changes had to be made to my first pass at the graphic. I suck (but I’m getting better, really!). Color, bar width, grid style, font size, axis size, alignment, plus and minus signs, spacing, and area fill.

    While writing statistical reports for class, I think it was easy to get away with so-so graphics. Just plug some data into R, use the plot function, and ta-da. I’ll never look at charts and graphs the same way again.

  • Every day I learn a lot, and every day I get better. For most of the day today, I worked on a single graphic (that hopefully runs in the paper). I gave it to the person in charge, and oh man, there was a lot to change. Fonts, labels, fill colors, bar widths, spacing, layer orientation, size… on and on and on. I think it might have been faster for him to make the graphic himself than it was for him to fix mine.

    Sigh. Gotta practice.

    The graphic above is the number of daily steps I’ve taken since I started wearing a pedometer. Can you tell when I moved to the city and was forced to walk to the subway and work?

  • So I started on a graphic today, and it might actually be in the paper. I’m pretty excited to see my first graphic published. I won’t say what it is or when until I actually find out if it gets published to save myself from any embarrassment, but nevertheless, cool to think about.

    In other news, I got to see a coworker do his stuff with some mapping and whatnot, whizzing through Adobe Illustrator like it was part of him. His attention to detail and his ability to do it quickly were what impressed me the most. I have a lot to learn and a lot to do.

    It’s also comforting to know that previous interns didn’t know any Adobe Illustrator either, so I don’t feel as dumb anymore.

  • Ok, so I said I would update after my first day, but I got home around 8p, and was just too tired to do anything. I forgot what it’s like to continuously work for 8 hours a day. It’s 9:30p right now, and honestly, I’m ready to pass out.

    My first two days have been interesting, and it’s very clear that I have a lot to learn, in terms of visualizing graphics. I have yet to create an actual graphic. Rather I’ve been sifting through Iraq data, looking for stuff that’s interesting, and then putting it in a spreadsheet. I was also given an article to read that could possibly benefit from a graphic, but it turns out that the writer already got some amazing photos and a small map, so that was a no-go.

    I think my strength is R, so I’m going to try to improve (um, learn) mapping in R. I think if I can do that, I will be of much more help. I was asked today if I could do maps, but unfortunately, I’ve only done very basic things, and the task was time-sensitive. Sigh.

    Am I rambling? I feel like I’m rambling. I need to sleep. But I need to learn R. Alright then.

    On a completely random note, I was able to see Matthew Carter, a master font designer, who has designed fonts like Georgia and Verdana, speak today. I never really put much thought into typefaces, but wow, there’s a whole lot that goes into it. There are a lot of subtleties that involve making more text fit on a page without cluttering, or techniques to make text easier to read in a newspaper or a magazine.

    Oh yeah, I also saw my idols: the IBM Visual Communication Lab. I didn’t get a chance to talk to them, but they looked like a friendly bunch.

  • In honor of my New York Times induction day, a visualization of The Wealthiest Americans Ever. You’d think good ol’ Billy, worth $82 billion, would be there at the top of the list, but there have been a few who have preceded the software giant, e.g. John D. Rockefeller, worth a crazy $192 billion. Just think how many Jack in the Box tacos you could buy with that kind of money.

  • Two months flew by in a hurry in between Spring quarter ending and my New York Times internship beginning. I arrived in New York City today and I’m starting at The Times tomorrow. For the next 10 weeks, I’m going to be working in the graphics department, and I anticipate a pretty steep learning curve.

    I don’t really know what I’m going to be doing yet, so I’ve familiarized myself with Adobe Illustrator and Flash and tried to pick up a few more skills in R. Originally, I thought I was going to become some kind of expert in Illustrator and/or Flash, but after some time with the books, I’ve got a long road ahead. Not being so hot in either Illustrator or Flash, I’m a little nervous, but I guess I’ll just wait and see.

    Updates on my first day tomorrow.

  • Yes, more mapping. Map, map, map. amMap offers a Flash-based mapping tool that you can download and customize to your liking.

    Ammap is an interactive flash map creation software. Use this tool to show locations of your offices, routes of your journeys, create your distributor map. Photos or illustrations can be used instead of maps, so you can make different presentations, e-learning tools and more.

    There’s some smooth browsing and zooming, and it’s pretty sleek. Those who appreciate simplicity will appreciate amMap. Plus, it’s free :)

  • In her TED talk, Emily Oster challenges our conception of AIDS and suggests other covariates that we need to look at (e.g. export volumes of coffee). Until we get out of the mindset that poverty and health care are the only causes/predictors of AIDS, we won’t be able to find the best way to fight the disease. Another great use of data.

    I do have one small itch to scratch though. Emily had a line plot that shows export volumes and another line, on the same grid, of HIV infections, both over time. It reminds me of the plots that Al Gore uses with carbon dioxide levels and temperature. Anyways, using the plot, Emily suggests a very tight relationship between export volumes and HIV infections. Isn’t export volume pretty tightly knit to poverty? I don’t know. She’s the economist, so she would know (A LOT) better than me. I guess I just wish she talked a little bit about the new and different data she has that compels us to change our conceptions.

  • Gas Prices over Time

    While on the subject of gas prices, Foreign Policy has a graph of the prices per gallon of gasoline from 2000 to 2006. With the US at the lower tier, I feel like a bit of a whiner (“Waa waa waa, it costs 30 dollars to fill my tank”). At the lower end, Venezuela seems the place to be, with some major government subsidizing going on.


  • A very simple graph from The Economist (spiced up a bit with a picture of a delicious gasoline droplet) that quickly gets its point across. The United States uses a lot of petrol compared to other countries, while at the same time, it costs less to fill up a Honda Civic in the US than most other places.

    However, the left graph is based on 2003 data. I wonder what the graph looks like now? Similar, I’m sure, but still something to look at.

    Anyways, something really interesting here — even though Venezuela has crazy low gas prices, the average petrol consumption per day over there is still quite low. Whether this is a cultural thing or just some weird supply and demand thing (that I have no clue about) might be worth some investigating.

    In any case, we have lower gas prices (that we still complain about) than a lot of the world, and we’re still consuming a lot. What’s our excuse?

  • As Jon Udell has mentioned, there’s a ton of data online, but it’s not often we can find it, since it’s often hidden in the deep, dark basement of some website. He has proposed that people bookmark public datasets on del.icio.us under the tag “publicdata”. I think this is a great idea. In turn, you can subscribe to the feed with the url http://del.icio.us/tag/publicdata.

    I’ve been doing this already for a while, but I had been just tagging with “data”. So I’m going to join in on the party and start tagging with publicdata, and I hope others will too. Until sites like Many Eyes and Swivel get more wind beneath their wings, I think it’s necessary.

  • Ratatouille Visualization

    By now, I’m sure everyone has heard of Pixar’s most recent movie, Ratatouille. If you haven’t seen it, I HIGHLY recommend it. Not only is it beautiful animation and a nice story, but it’s about food. I love Pixar. There are a few scenes in the movie when the main character, Remy, and his brother, Emile, are eating and experiencing the taste of some exquisite cheese.

    There was some pretty taste visualization going on, done by Michel Gagne.

    Around 1400 drawings were created for the animation. Each one was scanned, painted and composited using two softwares: Animo and Photoshop.

    That’s a lot of hand drawings, but quite nice results. Good job, Michel.

  • Pedometer

    It’s really easy to be lazy when you work from home. I can tell you this first-hand.

    Twenty-six steps from my bedroom to the kitchen; 6 steps from bedroom to study room; 29 steps from study room to kitchen; 24 steps from kitchen to bathroom. Do some back and forth, go through the rotation a few times, and that’s my day. I can easily go a whole day walking (or dragging my feet) only 300 steps. That’s sad.

    Just how sad is it? The Walking Site (um, yes, there really is a walking site :) recommends 10,000 steps per day. Wow, only 9,700 steps away! I’m pretty sure I’m slowly getting fatter due to my sloth-like behavior.

    In an effort to avoid the gut, I’ll be wearing my trusty pedometer to shoot for 10,000 steps per day. Of course I’ll be logging this data online, and we can all see how un-lazy I can become. Who knows?

    I can tell you this though. I used to wear this nifty step counter a few months back, and it certainly made me more aware of my laziness. I started walking more and took the long route, around campus, from my office to the car. Sometimes, we just need to see proof to change. As if a pot belly and excessive sweating weren’t enough.
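
    The step arithmetic in this post can be sketched in a few lines of Python. This is only an illustration: the room-to-room counts, the 300-step day, and the 10,000-step target come from the post, while the names `ROOM_STEPS`, `route_steps`, and `shortfall`, and the particular loop through the rooms, are made up for the sketch.

```python
# Step counts between rooms, taken from the post
# (assumed here to be the same in both directions).
ROOM_STEPS = {
    ("bedroom", "kitchen"): 26,
    ("bedroom", "study"): 6,
    ("study", "kitchen"): 29,
    ("kitchen", "bathroom"): 24,
}

DAILY_TARGET = 10_000  # recommendation cited in the post


def steps_between(a, b):
    """Look up the step count for a pair of rooms, in either direction."""
    if (a, b) in ROOM_STEPS:
        return ROOM_STEPS[(a, b)]
    return ROOM_STEPS[(b, a)]


def route_steps(route):
    """Total steps for walking through the rooms in `route`, in order."""
    return sum(steps_between(a, b) for a, b in zip(route, route[1:]))


def shortfall(steps_taken, target=DAILY_TARGET):
    """Steps still needed to reach the target (never negative)."""
    return max(target - steps_taken, 0)


# One hypothetical loop: bedroom -> study -> kitchen -> bedroom.
loop = ["bedroom", "study", "kitchen", "bedroom"]
print(route_steps(loop))  # 6 + 29 + 26 = 61
print(shortfall(300))     # 10,000 - 300 = 9,700
```

    At 61 steps per loop, a 300-step day works out to roughly five trips around the apartment.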

  • I just added the Browser Statistics add-on to my Firefox browser. On the bottom left corner, it shows the number of kilobytes downloaded for the current page, total number of kilobytes downloaded since the last start of the browser, and number of pages loaded. I’m going to try to log these numbers each day and try to make use of the data (uh, if laziness doesn’t get the best of me). If only there were some automated data logging.
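
    Short of real automation, the daily logging could at least be reduced to one function call. A rough Python sketch, assuming the three numbers are still read off the status bar and typed in by hand (nothing here assumes the add-on exposes them in any way); the file name, the column names, and the `log_stats` and `read_stats` functions are all hypothetical.

```python
import csv
import os
from datetime import date

# Hypothetical column names for the three numbers the add-on displays.
FIELDS = ["date", "page_kb", "session_kb", "pages_loaded"]


def log_stats(path, page_kb, session_kb, pages_loaded, day=None):
    """Append one day's readings to a CSV file, writing a header if the file is new."""
    day = day or date.today().isoformat()
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(FIELDS)
        writer.writerow([day, page_kb, session_kb, pages_loaded])


def read_stats(path):
    """Read the log back as a list of dicts, one per logged day."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

    Something like `log_stats("browser_stats.csv", 120, 45300, 210)` at the end of each day (the numbers are placeholders) would build up a file that can be plotted later.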