In light of the Donald Sterling brouhaha, Amanda Cox for The Upshot put up some charts for why you shouldn't be surprised that people still say racist things, based on data from the General Social Survey dating back to 1972. Mmhm.
Matthew Klein for Bloomberg View explored mortality in America through a slide deck of charts. The animations between slides grow tedious, but the topics covered, which go beyond the national mortality rate, are worth browsing. (Although, can someone tell me why the female mortality rate rose between the 1970s and 2000? I know there's a perfectly valid reason behind the trend, but I can't remember.)
The data itself is also worth your time, in case you're looking for a side project. It comes from the Centers for Disease Control and Prevention and spans 1968 through 2010.
I can tell you from experience that the data query process isn't the smoothest, which is about as much as you can expect from a government site, I guess. That said, the amount of data, with a variety of demographic breakdowns and categorizations, can make for plenty of worthwhile projects. Highly recommended.
This year's polar vortex churned up some global warming skeptics, but as we know, it's more useful to look at trends over significant spans of time than isolated events. And, when you do look at a trend, it's useful to have a proper baseline to compare against.
To this end, Enigma.io compared warm weather anomalies against cold weather anomalies, from 1964 to 2013. That is, they counted the number of days per year that were warmer than expected and the number that were colder than expected.
An animated map leads the post, but the meat is in the time series. There's a clear trend towards more warm days.
Since 1964, the proportion of warm and strong warm anomalies has risen from about 42% of the total to almost 67% of the total – an average increase of 0.5% per year. This trend, fitted with a generalized linear model, accounts for 40% of the year-to-year variation in warm versus cold anomalies, and is highly significant with a p-value approaching 0.0. Though we remain cautious about making predictions based on this model, it suggests that this yearly proportion of warm anomalies will regularly fall above 70% in the 2030's.
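For a rough sanity check on those numbers, here is a minimal sketch that draws a straight line through the two endpoints quoted above (about 42% in 1964 and about 67% in 2013). This is not the generalized linear model from the post. The function name projected_share is made up for illustration, and the inputs are the rounded figures from the quote, so treat the output as back-of-the-envelope only.

```python
# Endpoint values are the approximate figures quoted in the post, not the raw data.
START_YEAR, END_YEAR = 1964, 2013
START_SHARE, END_SHARE = 42.0, 67.0  # percent of anomalies that were warm

# Average change in percentage points per year between the two endpoints.
slope = (END_SHARE - START_SHARE) / (END_YEAR - START_YEAR)


def projected_share(year):
    """Straight-line extrapolation from the 2013 endpoint (hypothetical helper)."""
    return END_SHARE + slope * (year - END_YEAR)


if __name__ == "__main__":
    print(f"average increase: {slope:.2f} points per year")
    print(f"straight-line share in 2030: {projected_share(2030):.1f}%")
```

The slope works out to roughly half a point per year, in line with the quoted 0.5%, and the straight line is above 70% by 2030. It ignores the year-to-year variation the researchers mention, which is why they hedge with "regularly" rather than "always."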
Explore in full or download the data and analyze it yourself. Nice work. [Thanks, Dan]
Kevin Wu made a straightforward interactive that lets you see IMDB television ratings over time, per episode and by season.
Stamen visualized Bitcoin activity, noting a variety of traders who knew what they were doing, didn't know what they were doing, and were apparently automated.
In February 2014 MtGox, one of the oldest Bitcoin exchanges, filed for bankruptcy protection. On March 9th a group posted a data leak, which included the trading history of all MtGox users from April 2011 to November 2013. The graphs below explore the trade behaviors of the 500 highest volume MtGox users from the leaked data set. These are the Bitcoin barons, wealthy speculators, dueling algorithms, greater fools, and many more who took bitcoin to the moon.
Pantheon, a project from the Macro Connections group at The MIT Media Lab, explores cultural influences across countries and domains.
To make our efforts tractable, Pantheon will not focus on culture, as it is understood in its broadest sense, but on cultural production. In a broad sense, culture can be understood as all of the information that humans—or animals—generate and transmit through non-genetic means. At Pantheon, however, we do not focus on the entire range of cultural information, but in a subset of this information that we define narrowly as cultural production. That is, we do not focus on cultural information such as passed on family values or societal trust, but on cultural production as proxied by the biographies of notable historical characters. Moreover, we focus on the subset of cultural production that we can identify as global culture, meaning the subset of cultural production that has broken the barriers of space, time and language.
Rankings inevitably come into play, such as who the most influential philosopher, physicist, or country is, and the project covers a broad spectrum, so the methodology is the most important part here. Using data from Wikipedia, Freebase, and other online sources, the researchers created several indices that essentially give a score to individuals for popularity and production. This naturally results in estimation fuzziness, which means you take the results with salt and all that.
It's an interesting look though and a good start to something bigger. If anything, you'll probably learn something new after poking around for a bit.
Because you get more pizza to eat, and if you don't finish it, you'll have breakfast tomorrow. Other than that fine reason, well, it's geometrically the better deal. Planet Money explains with an interactive that shows the price per square inch for 3,678 pizza places across the United States, based on data from Grubhub.
The math of why bigger pizzas are such a good deal is simple: A pizza is a circle, and the area of a circle increases with the square of the radius.
More pizza more problems
So, for example, a 16-inch pizza is actually four times as big as an 8-inch pizza.
And when you look at thousands of pizza prices from around the U.S., you see that you almost always get a much, much better deal when you buy a bigger pizza.
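The geometry is easy to check yourself. A minimal sketch: the 8-inch and 16-inch sizes come from the example above, but the prices are made up for illustration and have nothing to do with the Grubhub data.

```python
import math


def pizza_area(diameter_in):
    """Area in square inches of a round pizza with the given diameter in inches."""
    return math.pi * (diameter_in / 2) ** 2


def price_per_sq_in(price, diameter_in):
    """Dollars per square inch of pizza."""
    return price / pizza_area(diameter_in)


# Doubling the diameter doubles the radius, which quadruples the area.
ratio = pizza_area(16) / pizza_area(8)

# Hypothetical menu prices, for illustration only.
small = price_per_sq_in(8.00, 8)
large = price_per_sq_in(16.00, 16)

if __name__ == "__main__":
    print(f"16-inch vs 8-inch area: {ratio:.1f}x")
    print(f"8-inch:  ${small:.3f} per square inch")
    print(f"16-inch: ${large:.3f} per square inch")
```

With these made-up prices, the large costs twice as much but gives you four times the pizza, so the price per square inch is cut in half.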
You get more pizza, and the business gets more money with minimal extra pizza-making effort. Win-win. Although, keep going on the horizontal axis and I bet that curve starts to curl up. Where can I get a ten-foot pizza?
Maris Jensen just made SEC filings readable by humans. The motivation:
But in the twenty years since, despite hundreds of millions invested in rounds of contracted EDGAR modernization efforts and interactive data false starts, the SEC's EDGAR has remained almost untouched. In 2014, the SEC is quite literally doing less with SEC filings than their predecessors had planned for 1984. Data tagging is the red-headed stepchild of the Commission -- out of hundreds of forms, only about a dozen are filed as structured data -- and the first program to automate the selection of SEC filings for review, the Division of Economic and Risk Analysis (DERA)'s 'Robocop', has been 'aspirational' for years. The academics in the division responsible for the SEC's interactive data initiatives write papers about information asymmetry, using EDGAR data they repurchase in usable form for millions each year, but do nothing to fix it. Companies are chastised for insufficient and inefficient disclosure, while the SEC fails to help retail investors navigate corporate disclosures at all.
Look up a company and see their financials, ownership, influences, and board members, among other things typically not so straightforward to look up.
Two Google research groups, Big Picture and Music Intelligence, got together and made a music timeline baby.
The Music Timeline shows genres of music waxing and waning, based on how many Google Play Music users have an artist or album in their music library, and other data (such as album release dates). Each stripe on the graph represents a genre; the thickness of the stripe tells you roughly the popularity of music released in a given year in that genre. (For example, the "jazz" stripe is thick in the 1950s since many users' libraries contain jazz albums released in the '50s.) Click on the stripes to zoom into more specialized genres.
As you'd expect, the initial view is a stacked area chart that represents the popularity of genres over time, which feels fairly familiar, but then you interact with the stacks and it gets more interesting and almost surprisingly fast. The best part is the pointers to specific albums as you mouse over.
When you watch sports, it can sometimes feel like the stat guy pulls random numbers for the talking heads to ponder, and you can't help but wonder how significant the numbers actually are. Benjamin Schmidt shows all the possibilities for a common statement during baseball games, and it turns out there are a lot of statements to pick from.
Statements of the form "Jack Morris won more games in the 1980s than anyone else" are fascinating. Although they're true, they rest on cherry-picked years that may or may not illustrate a deeper truth in context. (And we see them all the time: see my college degrees cherry-picker for another area.) For baseball, there are thousands of statements just like the ones here that you can make about any single cumulative stat over the game's history--10,296, to be exact. Printed out, all the statements you could make with the data here would take about 15,000 pages: this visualization lets you hone in on the patches of interest.
In 1976, Dwight E. Robinson, an economist at the University of Washington, studied facial hair of the men who appeared in the Illustrated London News from 1842 to 1972 [pdf].
The remarkable regularity of our wavelike fluctuations suggests a large measure of independence from outside historical events. The innovation of the safety razor and the wars which occurred during the period studied appear to have had negligible effects on the time series. King C. Gillette's patented safety razor began its meteoric sales rise in 1905. But by that year beardlessness had already been on the rise for more than 30 years, and its rate of expansion seems not to have augmented appreciably afterward.
Someone has to update this to the present. I'm pretty sure we're headed towards a bearded peak, if we're not at the top already.
New Year's is a worldwide event, but as we know, it doesn't happen simultaneously everywhere. Midnight happens in different time zones and in various languages, so Krist Wongsuphasawat from Twitter visualized the event in an animated interactive, as people tweeted happy new year around the world. Press play and see how it happened.
The best part is that UTC+01:00 area that covers Central Europe and Western Africa. Spikes in 16 languages by my count.
Engineering and psychology researchers in Finland investigated where we feel and don't feel.
The team showed the volunteers two blank silhouettes of person on a screen and then told the subjects to think about one of 14 emotions: love, disgust, anger, pride, etc. The volunteers then painted areas of the body that felt stimulated by that emotion. On the second silhouette, they painted areas of the body that get deactivated during that emotion.
The body maps above show the results of the survey. As you'd expect, the body looks like it shuts down with depression, and it lights up with happiness, but it's the subtle differences that are most interesting. I like the contrast between pride and anger, a difference of fists and feet.
Computer science PhD student Randy Olson likes to analyze reddit in his spare time. We saw his network of subreddits already, but his look earlier this year at the evolution of reddit is more interesting. The yearly breakdowns and explanations are the best part. I'm relatively new to reddit (and totally feel like an old man when I visit), so it's fun to see what the site used to be. More news and fewer Scumbag Steves, with a humble beginning in nsfw?
For the downtime post-turkey. James Trimble stuck the top 200 reddits of all time into a treemap. Let the time suck begin.
Presented mostly for my fond memories as a grade schooler, with a fresh 2400 bps modem in the 486, who recently discovered something called a BBS. Those were the good old days. My dad got me a 50-foot phone line to run from the computer to the phone jack in the back corner of another room.
Dan Delany took a simple look at furloughed employees due to the government shutdown. There are tickers for duration, estimated unpaid salary, and estimated food vouchers unpaid, but the main view is the interactive tree map that shows furloughed proportions by department.
Weddings are special events where friends and family come together to celebrate, and we encapsulate them in their special day. What if you looked at weddings over time though? Todd Schneider provides a view into wedding announcements in The New York Times in Wedding Crunchers, and although the announcements are mostly New York-based, you get a peek into events and social trends. Simply enter terms or phrases and see the trends over time.
Be sure to check out Schneider's detailed description and highlights of the data. [Thanks, Todd]
Forecast, one of the best if not the best quick-look weather sites, uses various weather models to predict temperature, wind, humidity, and pressure. Whereas the main result is an estimated map view along with highs and lows for the week, Forecast Lines shows you the weather models that drive the site.
Forecast works by statistically aggregating a number of different weather models into a single forecast. Because I can peek under the hood, I was able to take a look at all the raw models and see how many dipped below freezing. I saw that none of them did, which gave me confidence that my plants would be okay.
Today we’re launching a new weather app that lets everyone “peek under the hood.” We’re calling it Forecast Lines.
And like the main Forecast site, it works fine and dandy on your iPad or mobile device.
Allison McCann for Businessweek graphed rappers' claimed wealth in their songs versus their actual wealth.
Fresh off of Jay-Z's new album is the track Versus, on which he chides fellow hip-hop artists and their dubious tales of extraordinary wealth: "The truth in my verses, versus, your metaphors about what your net worth is." Like Jay-Z, we’ve long been skeptical of just how wealthy some hip-hop stars claim to be, so we created a way to separate the truly rich from the loud-mouth lyricists.
As you'd expect, some rappers tend to exaggerate. Speaking of which, this seems like a good time to revisit the map that shows the area codes where Ludacris claims to have hoes. Unfortunately, there is no data to verify or debunk.