  • Analysis of baseball ticket pricing

    April 10, 2013  |  Statistical Visualization

    Baseball ticket pricing

    If you've ever looked at ticket prices for sporting events, you probably noticed the disparity in prices when your team plays a popular team or a rival versus a less-than-stellar team. Last time I looked, a ticket to watch the Golden State Warriors play the Lakers or Heat cost twice as much as one to watch them play the Kings. David Yanofsky for Quartz noted the same pricing strategy in baseball.

    The heat map above shows the effect of visiting teams on ticket prices. As you'd expect (if you follow baseball even just a tiny bit), price goes up significantly when the New York Yankees come to town. In contrast, the price goes down when the Seattle Mariners show up.

    There's clearly a supply and demand thing going on here. Nobody wants to see bad teams play. But now it's time to pull a Billy Beane. How little can you spend on a team and a stadium and still make a profit? [Thanks, David]

  • Chartspotting: Coffee graph menu

    March 29, 2013  |  Statistical Visualization

    Coffee menu

    FlowingData reader Amir sent this along. In lieu of a list of coffee drinks, this place in East London opted for ingredient breakdowns. I'm guessing there's a standard menu outside the frame, because otherwise, coffee neophytes (like me) would have no clue what to do. Anyone care to fill in the blanks?

    Spot any charts in the wild? You should email me a picture.

  • Internet Census

    March 22, 2013  |  Statistical Visualization

    Internet map

    Upon discovering hundreds of thousands of open embedded devices on the Internet, an anonymous researcher conducted a Census of the Internet, mapping 460 million IP addresses around the world.

    While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address usage.

    It's a pretty thorough analysis, but the conclusion interested me most:

    The why is also simple: I did not want to ask myself for the rest of my life how much fun it could have been or if the infrastructure I imagined in my head would have worked as expected. I saw the chance to really work on an Internet scale, command hundred thousands of devices with a click of my mouse, portscan and map the whole Internet in a way nobody had done before, basically have fun with computers and the Internet in a way very few people ever will. I decided it would be worth my time.

    It makes me feel...uneasy. [Thanks, Roger]

  • Betting lines for becoming the next pope

    March 5, 2013  |  Statistical Visualization

    Probability of next pope

    Who's going to be the next pope? I know all of you are sitting on the edge of your seats. Luckily, an analytical research manager who goes by the name AJ hacked together a pope tracker.

    Despite not being Catholic, the papal election fascinates me. Not sure if it’s the old rituals, the world-wide interest, or simply the fact that the Catholic Church has left a huge mark on history.

    There’s no way I know enough about the inner workings of the Catholic Church to have any idea on who the next Pope may be.

    Since domain knowledge is out, the next best option?

    Follow the money!

    He's scraping odds of possible candidates becoming pope from a betting site, and the above shows the numbers over time. The odds were bumpy at first, but there seems to be some convergence, and as of this writing, Cardinal Peter Turkson from Ghana is the heavy favorite. [via Revolutions]

  • State of the Union address decreasing reading level

    February 12, 2013  |  Statistical Visualization

    State of the Union address reading level

    With the State of the Union address tonight, The Guardian plotted the Flesch-Kincaid grade levels for past addresses. Each circle represents a state of the union and is sized by the number of words used. Color is used to provide separation between presidents. For example, Obama's state of the union last year was around the eighth-grade level, and in contrast, James Madison's 1815 address had a reading level of 25.3.

    My guess is this has to do with changes in how we write and talk more than anything else. Lee Drutman and Dan Drinkard for the Sunlight Foundation ran a more rigorous analysis on Congressional records back in May, and the declining trend is similar.
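    For reference, the Flesch-Kincaid grade level is a simple formula over word, sentence, and syllable counts. Below is a minimal sketch in Python, not The Guardian's code. The function names are made up for illustration, and the syllable counter is a crude vowel-run heuristic (an assumption), so scores for real text will only be approximate.

```python
import re


def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def rough_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count runs of vowels, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade_of_text(text: str) -> float:
    """Approximate grade level of a text using the heuristic counter above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(rough_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)
```

    Longer sentences and longer words both push the score up, which is how a formal nineteenth-century address can land far above the eighth-grade level.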

  • Super Bowl ad costs vs. company profit during game

    February 1, 2013  |  Statistical Visualization

    Ritchie King for Quartz compared money spent on Super Bowl ads, now about $3.75 million for a 30-second spot, to how much the companies make on average in three and a half hours (the average length of a game).

    It's impossible to say exactly how much a successful Super Bowl ad ultimately earns a company. Surely the Wassup commercials were a huge boon for the Budweiser brand—but how huge?

    One thing is clear though: for the biggest advertisers, that $3.75 million is truly a pittance. In fact, some of them make almost as much in profits in an average 3.5 hours—roughly the time it takes to air the Super Bowl itself.

    Note that spending (on the bottom) is total between 2002 and 2011, and the vertical scales are different (so it probably would've been good to give more visual separation between the two charts), but still, kind of an interesting perspective.
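    The comparison comes down to scaling annual profit to a 3.5-hour window. Here is a rough sketch in Python of that arithmetic. It assumes profit accrues evenly across every hour of the year, which may not be exactly how King computed it, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def profit_during_game(annual_profit: float, game_hours: float = 3.5) -> float:
    """Average profit earned in a game-length window, assuming even accrual."""
    return annual_profit * game_hours / HOURS_PER_YEAR


def annual_profit_to_match(ad_cost: float, game_hours: float = 3.5) -> float:
    """Annual profit at which one game-length window of profit equals the ad cost."""
    return ad_cost * HOURS_PER_YEAR / game_hours
```

    Under that assumption, annual_profit_to_match(3.75e6) comes out to roughly $9.4 billion a year, which gives a sense of the scale a company needs before the ad cost looks like a few hours of profit.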

  • Baseball Hall of Fame voting trajectories

    January 30, 2013  |  Statistical Visualization

    Hall of fame voting trajectories

    Carlos Scheidegger and Kenny Shirley, along with Chris Volinsky, visualized Major League Baseball Hall of Fame voting, from the first class in 1936 (which included Babe Ruth) up to present.

    All a fan can do is accept that Baseball Hall of Fame voting, conducted by the Baseball Writers Association of America (BBWAA), is a phenomenon unto itself. If we can't understand baseball Hall of Fame voting, though, maybe the next best thing is visualizing the data behind it. The set of interactive plots on this webpage is our attempt to do that. We were especially interested in two things: (1) viewing the trajectories of BBWAA vote percentage by year for different players throughout history, and (2) simultaneously viewing the career statistics of these players, to help find patterns and explain their trajectories (or to reassure ourselves that the writers really are crazy).

    The interactive is on the analysis side of the spectrum, so you might be a bit lost if you don't know a lick about baseball. However, if you're a baseball fan, there's a lot to play around with and dimensions to poke around at, as you can filter on pretty much all player stats such as home run count, batting average, and innings played. At the very least, you're getting a peek at how statisticians pick and prod at their data.

    Start at the examples section for quick direction. I eventually found myself looking for downward trajectories. Poor Mark McGwire. [Thanks, Chris]

  • Character mentions in Les Miserables

    January 14, 2013  |  Statistical Visualization

    Les mis character mentions

    Jeff Clark took a detailed look at Victor Hugo's Les Miserables via character mentions, word connections, and word usage. The above is character mentions with color showing sentiment. Red means negative, and blue positive.

    Characters are listed from top to bottom in their order of appearance. The horizontal space is segmented into the 5 volumes of the novel. Each volume is subdivided further with a faint line indicating the various books and, finally, small rectangles indicate the chapters within the books. In the 5 volumes there are a total of 48 books and 365 chapters. The height of the small rectangles indicate how frequently that character is mentioned in that particular chapter.

    There's a good amount of blue towards the end, when everyone decides everyone else isn't so bad.

    See the full version and other views here.

  • Five years of traffic fatalities

    January 8, 2013  |  Statistical Visualization

    Traffic fatalities - alcohol a factor

    I made a graphic a while back that showed traffic fatalities over a year. John Nelson expanded on that, pulling five years of data and subsetting by some factors: alcohol, weather, and whether a pedestrian was involved. He also aggregated by time of day and day of week instead of calendar dates.

  • Longer life expectancy, more years of disease

    December 19, 2012  |  Statistical Visualization

    Life expectancy and healthy years

    Bonnie Berkowitz, Emily Chow and Todd Lindeman for the Washington Post plotted life expectancy against percentage of healthy years. Although life expectancy is increasing, the percentage of years living without disease isn't quite keeping up.

    People are living longer lives, but the time they are gaining isn't entirely time with good health. For every year of life expectancy added since 1990, about 9 1/2 months is time in good health. The rest is time in a diminished state — in pain, immobility, mental incapacity or medical support such as dialysis. For people who survive to age 50, the added time is "discounted" even further. For every added year they get, only seven months are healthy.

    On the other hand, the total number of expected years in good health is still on the plus side, and I think most people would choose years in poor health over fewer years. So it's not all bad news.

  • Get a visual recap of your year on Twitter

    December 11, 2012  |  Statistical Visualization

    Year on Twitter

    As 2013 nears, let the recaps, reviews, and best ofs begin. Twitter put up their 2012 year in review of top tweets, trends, and such, which is mostly pictures and lists, but in collaboration with Vizify, they also have a section to visualize your own tweets. Click on the "View year on Twitter" button in the top right. Here's mine, for example. (Surprise, I mention maps, data, and charts often.)

    It's a word frequency chart that shows usage over the year. Scroll left to right or mouse over bubbles to see specific tweets. Mostly, it's just fun to look back. [Thanks, Todd]

  • How tax rates have changed

    November 30, 2012  |  Statistical Visualization

    Changing tax burden

    Mike Bostock, Matthew Ericson and Robert Gebeloff for the New York Times explored changing tax rates from 1980 to 2010, for various income levels.

    Most Americans paid less in taxes in 2010 than people with the same inflation-adjusted incomes paid in 1980, because of cuts in federal income taxes. At lower income levels, however, much of the savings was offset by increases in federal payroll taxes, state sales taxes and local property taxes. About half of households making less than $25,000 saved nothing at all.

    Instead of trying to squeeze everything into one space, the graphic reads like a story, with changes in different types of taxes and comparisons across income levels.

  • Mitt Romney losing likes on Facebook, in real-time

    November 12, 2012  |  Statistical Visualization

    Mitt Romney unlikes on Facebook

    If you go to the Facebook page for Mitt Romney, note the number of likes, wait a few seconds, and then refresh the page. The number of likes is decreasing fast enough that you can see the change over a short period of time. Disappearing Romney charts the change in real-time.

    Tick, tick, tick.

    See also Who Likes Mitt, with the quick API hack on GitHub. [via @moebio]

  • History of film, 100 years in a chart

    November 2, 2012  |  Statistical Visualization

    History of Film

    In something of an homage to the Genealogy of Pop & Rock Music by Reebee Garofalo, designer Larry Gormley visualized 100 years of film.

    This graphic chronicles the history of feature films from the origins in the 1910s until the present day. More than 2000 of the most important feature-length films are mapped into 20 genres spanning 100 years. Films selected to be included have: won important awards such as the best picture Academy Award; achieved critical acclaim according to recognized film critics; are considered to be key genre films by experts; and/or attained box office success.

    Available in print for 34 bones.

  • Lord of the Rings visualized

    October 24, 2012  |  Statistical Visualization

    Decline of the longevity of men

    Driven by his love for Lord of the Rings, Emil Johansson explores the many facets of the world in charts and graphs. For example, the above chart is the declining lifespan of man.

    It is explicitly stated by Tolkien that the longevity of Men once granted to the Númenóreans decreased over the years. In Letter 156 Tolkien writes that "a good Númenórean died of free will when he felt it be the time to do so". With the Shadow and the Downfall of Númenor this grace was taken away from them and they died involuntarily with a decreasing lifespan.

    The decreasing life span is seen clearly in the graph. The most dramatic change is shortly before the Downfall of Númenor. The rulers are shown in order. Their number should not be confused with how many generations from Elros Aragorn is since there were more than one line of rulers.

    There's also a geographic map of where characters traveled, a family tree, a timeline, and even an Android app. I think Johansson might be a superfan. A hunch.

  • Presidential campaign finance explorer

    September 26, 2012  |  Statistical Visualization

    Presidential campaign finance explorer

    Hey, I think it's election season, and you know what that means. It's time to dig into campaign finance data from the Federal Election Commission. The Washington Post gives you a view into the amount of money raised and spent in both camps, where it's coming from and where it's going. They start with the high-level aggregates, and as you scroll down, you get the time series, followed by the breakdowns for money raised.

    The spending categories at the bottom are the most interesting bit. They cover advertising and mail, down to consulting and events. Payroll was a lot higher than I would've thought.

  • Color names plotted against gender

    September 20, 2012  |  Statistical Visualization

    His and Hers Colors by Stephen Von Worley

    A couple of years ago, xkcd ran a survey that asked people to name colors. Stephen Von Worley plotted that data by gender in an interactive.

    That's a dot for each of the 2,000 most commonly-used color names as harvested from the 5,000,000-plus-sample results of XKCD's color survey, sized by relative usage and positioned side-to-side by average hue and vertically by gender preference. Women tend to use color names nearer the top, men towards the bottom, and the dashed line represents the 50-50 split (equal usage by both sexes).

    While his original version was static, the interactive version lets you sort by hue, saturation, brightness, popularity, and name length. Most importantly, you can see the color names now when you mouse over. I like the vertical spectrum of purple, where women use names like bright lilac, orchid, and heather, and men tend to label similar shades as purplish, lightish purple, and oh yes, very light purple. [Thanks, Stephen]

  • Animated political contributions

    September 14, 2012  |  Statistical Visualization

    The Forest of Advocacy is a series of animations that explores the political contribution patterns among eight organizations, such as Bain Capital, Goldman Sachs, and Harvard Business School.

    These visualizations provide a dynamic look at the partisan tilt of giving within organizations. For each organization, individuals are characterized as points sketching out a line over time. The X axis is time, and the Y axis represents the net partisan tilt of contributions over the preceding 6 months. Over the decades, one sees lines sketched out, reflecting the partisanship of individuals over time. For each organization, we also provide the net contributions of the entire organization, and the names of biggest Democratic, Republican, and "bipartisan" contributors (the individual with the highest product of Democratic and Republican contributions).

    At the core, each animation is a time series chart, but the aesthetic and the narrated animation provide a more organic feel. In particular, the movements of people, represented by squares shifting straight across or up and down, make it easy to see consistent and not so consistent contributions. [Thanks, Mauro]
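    The "bipartisan" label in the description above is defined as the individual with the highest product of Democratic and Republican contributions. A minimal Python sketch of that selection follows. The data layout, function name, and donor names are hypothetical, not taken from the project.

```python
from typing import Dict, Tuple

# name -> (total given to Democrats, total given to Republicans), in dollars
Totals = Dict[str, Tuple[float, float]]


def most_bipartisan(totals: Totals) -> str:
    """Return the contributor with the largest Democratic x Republican product."""
    return max(totals, key=lambda name: totals[name][0] * totals[name][1])


# Made-up figures purely for illustration.
example = {
    "donor_a": (1000.0, 0.0),   # one-sided, so the product is zero
    "donor_b": (300.0, 400.0),  # moderate amounts to both sides
    "donor_c": (50.0, 2000.0),  # large total, but lopsided
}
winner = most_bipartisan(example)
```

    Using a product rather than a sum means anyone who gives to only one party scores zero, and lopsided givers score lower than people who split their money more evenly.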

  • History of tax breaks

    September 11, 2012  |  Statistical Visualization

    Tax breaks

    Kat Downs, Laura Stanton and Karen Yourish of The Washington Post look at the tax breaks from the 1970s to 2011 in an interactive.

    The U.S. government gives away more than $1 trillion a year in tax breaks — subsidies for individuals and companies that are often substitutes for direct government spending.
    Once written into the tax code, they tend to stick around.

    Each stripe represents a tax break, and height represents the value of the break in 2011. Interaction is key here: you can select categories such as education and health, and mouse over breaks for more information. The chart above is also linked with a time series, which provides an alternative view of the same data.

  • Wikipedia is dominated by male editors

    September 11, 2012  |  Statistical Visualization

    Wikipedia Gender

    After he saw a New York Times article on the gender gap among Wikipedia contributors (the contributor base is only 13 percent women), Santiago Ortiz plotted articles by number of men versus number of women who edited them. It's interactive, so you can mouse over dots to see what article each represents, and you can zoom in for a closer look in the bottom left.

    At first glance, the difference doesn't look that big, but notice the values of the axes. The axis for men on the horizontal is from 0 to 200, the axis for women is 0 to 20, and the equal ratio line is the purple one that's nearly vertical. So the only article with more women contributors is on cloth menstrual pads.

    See also: what the chart looks like with equally-spaced increments. The results are clear.

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.