Statisticians everywhere are squealing in delight over this story on fellow statistician Mohan Srivastava, who used his know-how to crack the code of a tic-tac-toe scratcher lottery game. After winning three dollars on a scratcher ticket that was given to him as a gag gift, Srivastava got to wondering about the process of how tickets were made. As a geological consultant who figures out if areas are worth mining for gold, he wondered if he could do the same with this scratcher.
-
Here are some links for you to accompany your bag of chips, bowl of chili, and plate of wings as you wait for the Super Bowl to start.
Super Bowl FanMap — Who are people picking to win the big game in your area? ESRI is taking votes and mapping them. Packers are far ahead, leading 64 percent to 36. [via]
Where the Streets Have Your Name — See where there are streets named after you. Yeah, you.
d8taplex — A super alpha version of something that looks like a data search engine with graphs.
Top-spending cities for personal care — Active north and sedentary south?
-
Designer Ibraheem Youssef iconifies the most viewed YouTube videos of all time. Do you recognize what each icon represents? I’m embarrassed to say that I probably know one too many of them.
-
Fortune Magazine recently published its annual list of top companies to work for, with SAS, Boston Consulting, and Wegmans taking the one, two, and three spots, respectively. To accompany the piece, this interactive, produced by Tommy McCall, shows what the employees have to say about their companies.
-
Apparently moods on Twitter can be used to predict the ups and downs of the stock market, according to work from Johan Bollen and Huina Mao of Indiana University-Bloomington: “Measuring how calm the Twitterverse is on a given day can foretell the direction of changes to the Dow Jones Industrial Average three days later with an accuracy of 86.7 percent.”
I can’t wait until Twitter is used to predict when I want to eat and sleep, and my robot can cook me gourmet meals and provide turndown service accordingly. And it better be accurate to the minute. Anything less is failure.
-
Christopher Beam for Slate explains research being done at UCLA in collaboration with the LAPD on predictive policing:
Predictive policing is based on the idea that some crime is random—but a lot isn’t. For example, home burglaries are relatively predictable. When a house gets robbed, the likelihood of that house or houses near it getting robbed again spikes in the following days. Most people expect the exact opposite, figuring that if lightning strikes once, it won’t strike again. “This type of lightning does strike more than once,” says Brantingham. Other crimes, like murder or rape, are harder to predict. They’re more rare, for one thing, and the crime scene isn’t always stationary, like a house. But they do tend to follow the same general pattern. If one gang member shoots another, for example, the likelihood of reprisal goes up.
This happened in my neighborhood when I was in fifth grade. We lived in a pretty quiet neighborhood, but one morning a window was open. Someone had come into our house while we were sleeping and stolen whatever was in immediate reach. They also stole my dad’s brand new bicycle from the garage. Same thing happened to my neighbor two days later.
[Slate via @amstatnews]
-
Even Bill Gates has an infographics section. In his 2011 annual letter, Gates focuses on polio and vaccines, and uses graphics to highlight spots. Most of them have to do with the decrease in the number of polio cases and the increase in vaccine coverage, but there’s one graph that made me do a double take. It shows the correlation between IQ and disease burden. Question of the day: if we decrease disease burden in a country by improving healthcare (or availability of vaccines), will the country as a whole become smarter, or are better educated people generally healthier?
[Gates Foundation | Thanks, Michael]
-
In regard to a performance chart posted by Netflix, Andy Baio, who, along with around 7 percent of men, is colorblind, explains why it’s so hard to read the chart. “When doing the right thing is this easy, it’s really disturbing when it’s dismissed as a waste of time.”
[Waxy]
-
I post this graphic by Muller on the Coen brothers filmography mostly because, well, of the Coen brothers filmography. I also kind of like the name. Main characters are shown from the most recent film on down, and connecting lines show the previous Coen films each actor was in.
[Muller | Thanks, Thomas]
-
I’m partial to all things food and drink related, so naturally my eyes light up when they’re combined with charts. Fabio Rex illustrates what makes the perfect drink in a set of pie charts and annotated glasses. Below, Rex describes the perfect Mojito and above are breakdowns via pie chart of various other drinks.
-
The Chronicle of Higher Education lets you explore the percentage of adults with college degrees from 1940 up to the present, by county. Press play and watch the national average go up from 4.6 percent to 27.5, or select a county for breakdowns and a time series.
-
Add another online destination to find the data that you need. DataMarket launched back in May with Icelandic data, but just a few days ago relaunched with data of the international variety. They tout 100 million time series datasets and 600 million facts. I’m not totally sure what that means (100 million lines, sets of lines?), but I take it that means a lot.
Just over 2 years and countless cups of coffee after we started coding, DataMarket.com launches with international data. You can now find, visualize and download data from many of the world’s most important data providers on our site.
At first glance, DataMarket feels a lot like the now-defunct Swivel. Search for the data you want and you get back a list of datasets. The focus on only time series, though, is actually a plus, in that they can provide more specific tools to visualize and explore. The current toolset isn’t going to blow you away, but it’s not bad.
-
My many thanks to the FlowingData sponsors. They help me keep the servers running and the posts coming. Check ’em out. They help you understand your data.
InstantAtlas – Enables information analysts and researchers to create highly-interactive online reporting solutions that combine statistics and map data to improve data visualization, enhance communication, and engage people in more informed decision making.
Tableau Software — Combines data exploration and visual analytics in an easy-to-use data analysis tool you can quickly master. It makes data analysis easy and fun. Customers are working 5 to 20 times faster using Tableau.
Want to sponsor FlowingData? Contact me at [email protected] for more details.
-
In a survey of rankings from a variety of sources, Pleated Jeans maps the United States of Shame. Because all states must be bad at something. Go, California. If we’re the worst at air pollution, does that mean we actually have really clean air? Must be.
-
DeviantArt user dehahs, who seems to enjoy making graphics based on fiction (see here and here), classifies kills by main character Dexter of the popular Showtime series of the same name. Each kill is color-coded by type, and the weapon used is provided. The estimated number of deaths caused by each killee is also shown at the bottom in red dots. It’s kind of gruesome, but any Dexter fan will appreciate this.
[DeviantArt via datavis]
-
Mina Liu and Oliver Uberti for National Geographic examine the most common surnames across the country:
What’s in a Surname? A new view of the United States based on the distribution of common last names shows centuries of history and echoes some of America’s great immigration sagas. To compile this data, geographers at University College London used phone directories to find the predominant surnames in each state. Software then identified the probable provenances of the 181 names that emerged.
The most common surnames are then placed geographically and colored by origin. Browse the full-sized map here. Is your name in there?
-
Okay, I sort of dropped the ball on this one. I have a free pass up for grabs to the O’Reilly Strata Conference next week, February 1 to 3. Here’s the short description:
Unprecedented computing power and connectivity are bringing new layers of experience to our lives in how we manage and present data sets of all sizes.
Throughout three days of training, breakout sessions, and plenary discussions, O’Reilly Strata connects the decision-makers, practitioners, and leading vendors from enterprise and the web who are at the leading edge of this space. Topics include data science, acquisition, organization, machine learning, visualization, and more.
Want to win the free pass? Leave a comment below by Friday, January 28, 2011 at 7pm PST. Tell us what super power you’d want if you could only pick one. I’ll pick a comment randomly and email you the discount code. Please only enter if you know you can attend February 1-3 in Santa Clara, California. I’d hate the pass to go to waste.
If you don’t want to leave it up to the randomized gods and just want to register now, it’s not too late to do that either. You can register here and get a 25% discount. The program looks like it’ll be a good one.
Update: Congrats to Tyson! “I’d like to fly”
-
In the spirit of the well-circulated Facebook friendship map by Paul Butler, research analyst Olivier Beauchesne at Science-Metrix examines scientific collaboration around the world from 2005 to 2009:
I was very impressed by the friendship map made by Facebook intern, Paul Buffer [sp] and I realized that I had access to a similar dataset. Instead of a database of friendship data, I had access to a database of scientific collaboration.
From an extensive database of academic citations:
I extracted and aggregated scientific collaboration between cities all over the world. For example, if a UCLA researcher published a paper with a colleague at the University of Tokyo, this would create an instance of collaboration between Los Angeles and Tokyo.
After that, Beauchesne used a mapping scheme similar to Butler’s, and behold the results above. The brighter the lines, the more collaborations between a pair of cities.
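The aggregation step Beauchesne describes can be sketched in a few lines of Python. This is only a rough illustration, not his actual code: the function name, the input format (each paper as a list of its authors' cities), and the choice to count each city pair once per paper are all assumptions.

```python
from collections import Counter
from itertools import combinations


def count_city_collaborations(papers):
    """Count collaborations between pairs of cities.

    `papers` is an iterable of papers, each given as a list of the
    authors' cities (a hypothetical input format for illustration).
    """
    pair_counts = Counter()
    for cities in papers:
        # De-duplicate and sort so that (A, B) and (B, A) are the same key,
        # then count each distinct pair of cities once per paper.
        for a, b in combinations(sorted(set(cities)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Made-up example data: three papers and the cities of their authors.
papers = [
    ["Los Angeles", "Tokyo"],
    ["Los Angeles", "Tokyo", "Paris"],
    ["Paris", "Paris"],  # single-city paper, so no collaboration pair
]
counts = count_city_collaborations(papers)
```

With counts like these in hand, each pair could be drawn as a line whose brightness scales with its count.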
-
President Barack Obama delivered his State of the Union address yesterday, and this year it was “enhanced” by charts and graphs. Basically, as Obama spoke, graphics that you could equate to PowerPoint slides showed up on the side. What’d you think of the enhancement? Did it add to or detract from the message? Were the graphics used honestly and effectively?
One thing’s for sure: there’s something wrong with that bubble chart. Uh oh.
-
Cartographer Daniel Huffman has a look at swearing in the United States, according to geocoded tweets:
Isolines are based upon an interpolated surface generated from approximately 1.5 million geocoded public posts on Twitter between March 9th and April 12th, 2010. These data represent only a sample of all posts made during that period. Isolines are based upon the average number of profanities found in the 500 nearest data points, in order to compensate for low population areas.
The brighter the red, the more profanities used in the area, and the more black, the less swearing. Words looked for were (pardon my language): fuck, shit, bitch, hell, damn, and ass, and variants such as damnit. Honestly, I never swear like this. Unless some idiot pickup truck tailgates me going 80 on the highway in the middle of the night. That doesn’t count though.
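The "500 nearest data points" idea is a nearest-neighbor average, and a toy version is easy to sketch in Python. This is not Huffman's method in detail: the function name, the (x, y, value) tuple format, and the use of plain straight-line distance on the coordinates are assumptions made for illustration.

```python
import math


def knn_average(points, query, k=500):
    """Average the values of the k points nearest to `query`.

    `points` is a list of (x, y, value) tuples, where value would be the
    number of profanities in one tweet. Straight-line distance on the
    coordinates is an assumption; a real map would likely use geographic
    distance.
    """
    nearest = sorted(
        points,
        key=lambda p: math.hypot(p[0] - query[0], p[1] - query[1]),
    )[:k]
    return sum(p[2] for p in nearest) / len(nearest)


# Made-up example data: three nearby tweets and one far away.
tweets = [(0, 0, 2), (1, 0, 0), (0, 1, 1), (10, 10, 5)]
avg = knn_average(tweets, query=(0, 0), k=3)
```

Because the number of neighbors is fixed rather than the search radius, sparse areas simply reach farther out for their 500 points, which is how the approach compensates for low population. Evaluating the average over a grid of query locations would give the surface that the isolines are drawn from.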