  • Porn star demographics

    February 15, 2013 to Statistics by Nathan Yau

    Porn star hair color

    Jon Millward explored porn star demographics using a data scrape from the Internet Adult Film Database: hair color, race, and birthplace, among other things. (There aren't any dirty pictures, but there's some terminology that might be NSFW.)

    The average measurements?

    I thought that maybe if the women are overestimating how light they are, they might also be a bit too generous when reporting their measurements. It turns out they probably aren’t though, because the most common bra size for a female porn star is a surprisingly handleable 34B. Not double-D, not even a D. Double-D actually came in 4th, behind B, C and D. The most common set of measurements for the women was 34-24-34.

    So, if the average female porn star is a 5'5" woman who weighs 117lbs and has B-cup breasts, what colour is her hair? Blonde, presumably, if my friends' guesses were anything to go by.

    Apparently not. Dark-haired porn stars outnumber blonde ones almost 2-to-1.

    Millward doesn't look at changes over time a whole lot, but if the BMI of Playboy playmates is any indicator, I bet those measurements have changed over the years.

  • A shroud of cold air descends on the U.S.

    February 15, 2013 to Mapping by Nathan Yau

    From NOAA, an animation showing a wave of cold during the Martin Luther King Jr. holiday weekend last month:

    A drop in the jet stream sent temperatures across the United States plummeting over the Martin Luther King Jr Holiday weekend. The pronounced change in temperatures can be seen in this weather data from NOAA/NCEP's Real-Time Mesoscale Analysis. Areas colored blue are below freezing. The diurnal cycle of heating and cooling can be seen over time, but the pattern is clear: much of the U.S. is pretty cold.

    While you're at it, you might as well check out other videos on the NOAA Visualizations YouTube channel. Some good stuff.

  • Redrawn United States of electoral votes

    February 14, 2013 to Mapping by Nathan Yau

    Electoral college reform (fifty states with equal population)

    Neil Freeman reimagined state boundary lines based on population. He started with an algorithm and the fifty largest cities, considered proximity, urban area, and commuting patterns, and then hand-tweaked boundary lines and shapes. The state names are mostly centered around geographic features (although I would have opted for ones based on dating profiles).

    "Keep in mind that this is an art project, not a serious proposal, so take it easy with the emails about the sacred soil of Texas." [via kottke | Thanks, Mickey]

  • Visualization spectrum

    February 13, 2013 to Visualization by Nathan Yau

    A handful of experts weighed in on visualization as a spectrum rather than an unyielding tool.

    The panelists emphasized repeatedly that data visualization exists on a spectrum. On one side are the pieces that are purely aesthetic and emotional, and on the other, the focus is purely on conveying the insights found in the data. Tom Carden, a data visualization engineer at Square, asks himself if the goal is to grab attention for a new idea, or to build a tool that will be used on an ongoing basis: "Tools need to be actionable, auditable, and they have to stand up to scrutiny long-term." Tools should be able to accommodate new data, he said, and should grow with companies in such a way that people aren’t surprised by a difference between this week and last week.

    On the other side of the spectrum are different types of insight that can be emotional, reflective, or just darn funny. These are as important as the analytic insight you get from tools, and the two feed into each other, providing a more realistic view of what data really represents.

    Some illustrated notes from the panel:

  • State of the Union address decreasing reading level

    February 12, 2013 to Statistical Visualization by Nathan Yau

    State of the Union address reading level

    With the State of the Union address tonight, The Guardian plotted the Flesch-Kincaid grade levels for past addresses. Each circle represents a state of the union and is sized by the number of words used. Color is used to provide separation between presidents. For example, Obama's state of the union last year was around the eighth-grade level, and in contrast, James Madison's 1815 address had a reading level of 25.3.

    My guess is this has to do with changes in how we write and talk more than anything else. Lee Drutman and Dan Drinkard for the Sunlight Foundation ran a more rigorous analysis on Congressional records back in May, and the declining trend is similar.
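
    For reference, the Flesch-Kincaid grade level is computed from words per sentence and syllables per word. Here's a minimal Python sketch, assuming a crude vowel-group syllable counter. The function names are made up for illustration, and this is not necessarily how The Guardian computed its numbers:

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, discounting a trailing silent 'e'."""
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


simple = "The cat sat. The dog ran."
dense = ("Notwithstanding considerable constitutional deliberation, "
         "legislators continually reconsidered extraordinary appropriations.")
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

    Long sentences and long words both push the score up, which is why very formal nineteenth-century prose can land well past a grade-12 level.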

  • The many relationships of Zeus

    February 12, 2013 to Network Visualization by Nathan Yau

    Zeus affairs

    Viviana Ferro, Ilaria Pagin, and Elisa Zamarian had a look at all of Zeus's relationships according to many authors over the years. Each person on the inside of a circle represents a lover, and the colored branches connect to children. Start with Zeus, the largest black dot near the middle, and then work your way out.

  • A fill-in-the-blank book to journal your life in graphs

    February 11, 2013 to Self-surveillance by Nathan Yau

    My life in graphs

    My friends just got this for me, and it's pretty much the perfect gift, especially since my dissertation is about journaling and personal data collection. My Life in Graphs: A Guided Journal is a book of blank charts and graphs, and you fill in the blanks. For example, there's a map to mark your travel destinations and an x-y plot to evaluate "bucket-list viability."

    I worked on mine a couple of days ago and showed it to my wife. She said I was like a kid showing off his homework. I think that's a good thing.

  • Mapping translations of Othello

    February 8, 2013 to Mapping by Nathan Yau

    Transvis

    Tom Cheesman of Swansea University, along with Kevin Flanagan and Studio NAND, dives into translations of Shakespeare's Othello with TransVis.

    TransVis collects, digitises, analyses and compares translations and variations of literary works. In an initial prototype named VVV (»Version Variation Visualisation«), we have proposed analysis methods, interfaces and visualization tools to explore 37 translations of Shakespeare’s Othello into German with more works translated into other languages to come.

    The map is more of a browser to see where specific publications were written, rewritten and published, but I wonder if you'd see anything interesting if you looked at just where something is rewritten or translated. It'd be like seeing ideas spreading. Or you know, Twilight copies.

  • Analysis of LEGO brick prices over the years

    February 7, 2013 to Statistics by Nathan Yau

    Cost of LEGO bricks

    Reality Prose has an excellent analysis on the changing price of LEGO bricks over the years and a misconception that cost has gone up. According to the chart above, based on data from BrickSet and adjusted for inflation, the average cost per brick has come down.
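
    The adjustment itself is simple arithmetic: scale the old sticker price by the ratio of price indexes, then divide by the piece count. A small Python sketch, where the function name and every number are placeholders for illustration and not values from BrickSet or the Reality Prose analysis:

```python
def real_price_per_brick(nominal_price: float, pieces: int,
                         cpi_then: float, cpi_now: float) -> float:
    """Price per brick in today's dollars.

    nominal_price: sticker price in the release year
    cpi_then / cpi_now: consumer price index in the release year and today
    """
    if pieces <= 0 or cpi_then <= 0:
        raise ValueError("pieces and cpi_then must be positive")
    return nominal_price * (cpi_now / cpi_then) / pieces


# Hypothetical set: $50 at release, 400 pieces, index doubled since then.
print(real_price_per_brick(50.0, 400, cpi_then=100.0, cpi_now=200.0))
```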

  • Philosophy of data

    February 6, 2013 to Statistics by Nathan Yau

    David Brooks for The New York Times on the philosophy of data and what the future holds:

    If you asked me to describe the rising philosophy of the day, I’d say it is data-ism. We now have the ability to gather huge amounts of data. This ability seems to carry with it certain cultural assumptions — that everything that can be measured should be measured; that data is a transparent and reliable lens that allows us to filter out emotionalism and ideology; that data will help us do remarkable things — like foretell the future.

    Be sure to read the comments. There's actually quite a bit of anti-data talk.

  • Local pub flowchart

    February 5, 2013 to Miscellaneous by Nathan Yau

    Pub flowchart

    From Reddit user ddurrr while visiting London. Pretty much my current status and mental capacity. [Thanks, Tom]

  • A visual exploration of US gun murders

    February 4, 2013 to Infographics by Nathan Yau

    Gun murders with a shotgun

    Information visualization firm Periscopic just published a thoughtful interactive piece on gun murders in the United States in 2010. It starts with the individuals: when they were killed, coupled with the years they potentially lost. Each arc represents a person, with lived years in orange and the difference in potential years in white. A mouseover on each arc shows more details about that person.

  • Time running parallel

    February 1, 2013 to Data Art by Nathan Yau

    In Waters Re~ artist Xárene Eskandar placed video of the same landscape at different times of day in parallel.

    They capture the subjective and perceptual qualities of time expressed as events, moments, memory and landscape. The goal is to break the linear experience of time, allowing viewers to perceive multiple times within a single viewpoint. As a result insignificant moments become significant events, heightening one's experience of the landscape and one's existence in that particular moment in time and space.

    The results are beautiful. [via FastCo]

  • Super Bowl ad costs vs. company profit during game

    February 1, 2013 to Statistical Visualization by Nathan Yau

    Ritchie King for Quartz compared money spent on Super Bowl ads (now about $3.75 million for a 30-second spot) to how much the companies make on average in 3 and a half hours (the average length of a game).

    It's impossible to say exactly how much a successful Super Bowl ad ultimately earns a company. Surely the Wassup commercials were a huge boon for the Budweiser brand—but how huge?

    One thing is clear though: for the biggest advertisers, that $3.75 million is truly a pittance. In fact, some of them make almost as much in profits in an average 3.5 hours—roughly the time it takes to air the Super Bowl itself.

    Note that spending (on the bottom) is total between 2002 and 2011, and the vertical scales are different (so it probably would've been good to give more visual separation between the two charts), but still, kind of an interesting perspective.
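
    The comparison boils down to scaling a company's annual profit to a 3.5-hour window. A rough Python sketch, assuming profit accrues evenly over a 365-day year. The annual profit figure is a made-up placeholder, not a number from the Quartz piece:

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def profit_in_window(annual_profit: float, hours: float = 3.5) -> float:
    """Profit earned in `hours`, assuming a constant rate all year."""
    return annual_profit * hours / HOURS_PER_YEAR


AD_COST = 3.75e6                    # 30-second spot, the figure quoted above
hypothetical_annual_profit = 9.0e9  # placeholder, not a real company's number

window_profit = profit_in_window(hypothetical_annual_profit)
print(f"Profit in 3.5 hours: ${window_profit:,.0f}")
print(f"Ad cost relative to that: {AD_COST / window_profit:.0%}")
```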

  • The most poisoned name in US history

    January 31, 2013 to Statistics by Nathan Yau

    Poisoned names

    Biostatistics PhD candidate Hilary Parker dived into the most poisoned names in US history. Her own name topped the list. There were several fad names such as Deneen, Catina, and Farrah that saw a quick spike and then a plummet, but the trend for Hilary is different.

    "Hilary", though, was clearly different than these flash-in-the-pan names. The name was growing in popularity (albeit not monotonically) for years. So to remove all of the fad names from the list, I chose only the names that were in the top 1000 for over 20 years, and updated the graph (note that I changed the range on the y-axis).

    I think it's pretty safe to say that, among the names that were once stable and then had a sudden drop, "Hilary" is clearly the most poisoned.

    There it is minding its own business, enjoying a steady rise in popularity over a few decades, and then boom, Bill Clinton is elected, and the name dies a quick death.
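
    The filter Parker describes can be sketched roughly as follows. This is a guess at the mechanics, not her actual code: the data layout, the function names, and the use of the largest one-year relative drop as the "poisoning" score are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: year -> (rank that year, share of births that year)
Series = Dict[int, Tuple[int, float]]


def is_stable(series: Series, cutoff: int = 1000, min_years: int = 20) -> bool:
    """True if the name ranked within `cutoff` for more than `min_years` years."""
    return sum(1 for rank, _ in series.values() if rank <= cutoff) > min_years


def worst_relative_drop(series: Series) -> float:
    """Largest one-year relative drop in share (0.5 means the share halved)."""
    years = sorted(series)
    worst = 0.0
    for prev, cur in zip(years, years[1:]):
        if cur - prev != 1:
            continue  # skip gaps so only consecutive years are compared
        prev_share = series[prev][1]
        if prev_share > 0:
            worst = max(worst, (prev_share - series[cur][1]) / prev_share)
    return worst


def most_poisoned(data: Dict[str, Series]) -> List[str]:
    """Stable names only, ordered from biggest to smallest one-year drop."""
    stable = {name: s for name, s in data.items() if is_stable(s)}
    return sorted(stable, key=lambda n: worst_relative_drop(stable[n]), reverse=True)
```

    With the stability filter in place, a fad name that spikes and crashes within a few years never makes the list, no matter how steep its fall.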

    Be sure to check out the rest of the analysis. Good stuff. [Thanks, @hspter]

  • Mercator map puzzle

    January 31, 2013 to Mapping by Nathan Yau

    Mercator puzzle

    The Mercator projection can be useful for giving directions, but when it comes to world maps, the projection doesn't hold up well as you move far north and south. By how much? Give this puzzle game a try and match the red boundaries to their respective countries.

  • Baseball Hall of Fame voting trajectories

    January 30, 2013 to Statistical Visualization by Nathan Yau

    Hall of fame voting trajectories

    Carlos Scheidegger and Kenny Shirley, along with Chris Volinsky, visualized Major League Baseball Hall of Fame voting, from the first class in 1936 (which included Babe Ruth) up to the present.

    All a fan can do is accept that Baseball Hall of Fame voting, conducted by the Baseball Writers Association of America (BBWAA), is a phenomenon unto itself. If we can't understand baseball Hall of Fame voting, though, maybe the next best thing is visualizing the data behind it. The set of interactive plots on this webpage is our attempt to do that. We were especially interested in two things: (1) viewing the trajectories of BBWAA vote percentage by year for different players throughout history, and (2) simultaneously viewing the career statistics of these players, to help find patterns and explain their trajectories (or to reassure ourselves that the writers really are crazy).

    The interactive is on the analysis side of the spectrum, so you might be a bit lost if you don't know a lick about baseball. However, if you're a baseball fan, there's a lot to play around with and dimensions to poke around at, as you can filter on pretty much all player stats such as home run count, batting average, and innings played. At the very least, you're getting a peek at how statisticians pick and prod at their data.

    Start at the examples section for quick direction. I eventually found myself looking for downward trajectories. Poor Mark McGwire. [Thanks, Chris]

  • NFL fans on Facebook, based on likes

    January 29, 2013 to Mapping by Nathan Yau

    Football fans in the United States

    As the Super Bowl draws near, Facebook took a look at football fandom across the country.

    The National Football League is one of the most popular sports in America with some incredibly devoted fans. At Facebook we have about 35 million account holders in the United States who have Liked a page for one of the 32 teams in the league, representing one of the most comprehensive samples of sports fanship ever collected. Put another way, more than 1 in 10 Americans have declared their support for an NFL team on Facebook.

    It's a fairly straightforward geographic breakdown based on the most liked team in each county, as shown above. So you can kind of see where rivalries come from.

  • Ten years of cumulative precipitation

    January 28, 2013 to Mapping by Nathan Yau

    We've all seen rain maps for a sliver of time. Screw that. I want to see the total amount of rainfall over a ten-year period. Bill Wheaton did just that in the video above, showing cumulative rainfall between 1960 and 1970. The cool part is that you see mountains appear, but they're not actually mapped.

    The hillshaded terrain (the growing hills and mountains) is based on the rainfall data, not on actual physical topography. In other words, hills and mountains are formed by the rainfall distribution itself and grow as the accumulated precipitation grows. High mountains and sharp edges occur where the distribution of precipitation varies substantially across short distances. Wide, broad plains and low hills are formed when the distribution of rainfall is relatively even across the landscape.

    See also Wheaton's video that shows four years of rain straight up.

    Is there more recent data? It could be an interesting complement to the drought maps we saw a few months ago. [Thanks, Bill]
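
    The idea of rainfall acting as terrain can be sketched in a few lines of Python: add up the rainfall grids over time, treat the running total as elevation, and measure how steeply it changes between neighboring cells. This is a simplified illustration with made-up function names and toy data, using plain slope magnitude instead of a true hillshade with a light source. It is not Wheaton's actual workflow.

```python
from typing import List

Grid = List[List[float]]


def accumulate(frames: List[Grid]) -> Grid:
    """Sum a sequence of equally sized rainfall grids cell by cell."""
    rows, cols = len(frames[0]), len(frames[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for frame in frames:
        for r in range(rows):
            for c in range(cols):
                total[r][c] += frame[r][c]
    return total


def steepness(surface: Grid) -> Grid:
    """Slope magnitude from forward differences, clamped at the grid edges."""
    rows, cols = len(surface), len(surface[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = surface[r][min(c + 1, cols - 1)] - surface[r][c]
            dy = surface[min(r + 1, rows - 1)][c] - surface[r][c]
            out[r][c] = (dx * dx + dy * dy) ** 0.5
    return out


# Toy data: the right-hand column gets far more rain than the rest.
frames = [[[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]] for _ in range(3)]
total = accumulate(frames)
print(steepness(total))
```

    Cells where the total jumps between neighbors get a large slope (the "sharp edges"), while evenly watered areas stay flat, and the contrast grows as more frames are added.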

  • Internet Explorer causation

    January 25, 2013 to Miscellaneous by Nathan Yau

    I'm almost certain this relationship is significant. Side note: Is there a meaningless-correlations tumblr yet? [via]

    Internet Explorer vs Murder Rate

Unless otherwise noted, graphics and words by me are licensed under Creative Commons BY-NC. Contact original authors for everything else.