• What Cell Phone Provider is Best For You?

    September 15, 2009  |  Statistics

    Picking a cell phone plan is confusing, but it doesn't have to be.

    Providers purposely make it that way, so you don't see all that you're forking over per month until you're locked into a horrible 2-year plan. Let's look at the data to find which cell phone provider has the best price.
    Continue Reading

  • Low Income Hinders College Attendance, Even for Top Students

    September 1, 2009  |  Statistics

    What if you were a good student but knew you weren't going to be able to go to college?

    I was fortunate enough for most of my life to know that if I wanted to get a higher education, I would be able to. Thanks, Mom and Dad. It's hard for me to imagine working hard in middle school and high school if I didn't have that goal in mind, but that's the path that many grow up with.

    The graph above shows the results of a Department of Education study that began in 1988. It shows that low-income students are the least likely to complete college, even when they did well in 8th grade. It's a much different story for high-income students.

    The Department tracked student progress in 8th grade and then through high school and college over the next 12 years. Only 3% of students from low-income families with low 8th grade math performance completed college. Compare that to students with the same math performance but from high-income families: thirty percent finished college. That's ten times the rate of the low-income group.

    What's worse is that many low-income students who had high math performance still didn't complete college. The college completion rate for low-income, high-math students was still lower than the rate for high-income, low-math students.

    [via @golan]

  • Data is the New Hot, Drop-dead Gorgeous Field

    August 7, 2009  |  Statistics

    We all know this already, but it's nice to get some backing from The New York Times every now and then. In this NYT article, which I'm sure has spread to every statistician's email inbox by now, Steve Lohr describes the dead sexy that is statistics:

    The rising stature of statisticians, who can earn $125,000 at top companies in their first year after getting a doctorate, is a byproduct of the recent explosion of digital data. In field after field, computing and the Web are creating new realms of data to explore: sensor signals, surveillance tapes, social network chatter, public records and more. And the digital data surge only promises to accelerate, rising fivefold by 2012, according to a projection by IDC, a research firm.

    I've got about one more year (hopefully) until I finish graduate school. Hmm, things are looking up, yeah? Of course, it's never been about the money. The profession of statistician didn't seem nearly so hot when I started school. The best news here is that we data folk are going to get paid for doing what we enjoy, and as time goes on there's only going to be more data to play with, and we're going to be in high demand:

    Yet data is merely the raw material of knowledge. "We're rapidly entering a world where everything can be monitored and measured," said Erik Brynjolfsson, an economist and director of the Massachusetts Institute of Technology's Center for Digital Business. "But the big problem is going to be the ability of humans to use, analyze and make sense of the data."

    Wait, but it's not just statisticians who can interpret data:

    Though at the fore, statisticians are only a small part of an army of experts using modern statistical techniques for data analysis. Computing and numerical skills, experts say, matter far more than degrees. So the new data sleuths come from backgrounds like economics, computer science and mathematics.

    Like a... data scientist? Excellent.

  • IT Dashboard and Data from USAspending.gov

    July 22, 2009  |  Data Sources

    Taking another step towards data transparency, the US government provides the IT dashboard via USAspending.gov:

    The IT Dashboard provides the public with an online window into the details of Federal information technology investments and provides users with the ability to track the progress of investments over time. The IT Dashboard displays data received from agency reports to the Office of Management and Budget (OMB), including general information on over 7,000 Federal IT investments and detailed data for nearly 800 of those investments that agencies classify as "major." The performance data used to track the 800 major IT investments is based on milestone information displayed in agency reports to OMB called "Exhibit 300s." Agency CIOs are responsible for evaluating and updating select data on a monthly basis, which is accomplished through interfaces provided on the website.

    Along with a page to filter and download spending data, there are a variety of views into the IT spending data, all of which provide a pretty good level of interaction.
    Continue Reading

  • Taking a Closer Look at Airplane-Bird Collisions

    July 16, 2009  |  Data Sources

    While we're on the subject of flight, ever since that plane landed in the Hudson River a few months ago, the thought of bird-airplane collisions hasn't strayed too far from the media (or my mind each time I fly). In light of all the hoopla, the Federal Aviation Administration (FAA) finally gave in and opened up its bird strike database to the public.

    Below is an interactive that explores this data, breaking things down by bird type, location, phase of flight, and time of day. Click through to this post to view.
    Continue Reading

  • Explore World Data with Factbook eXplorer from OECD

    The Organisation for Economic Co-operation and Development (OECD) makes a lot of world indicators available (e.g. world population and birth rate). Much of that data goes unnoticed, because most people just see a bunch of numbers. However, the Factbook eXplorer from the OECD, in collaboration with the National Center for Visual Analytics, is a visualization tool that helps you see and explore the data.

    Those who have seen Hans Rosling's Gapminder presentations - and I imagine most of us have - will recognize the style with a play button and a motion graph in sync with parallel coordinates and a map. Choose an indicator, or several of them, press play, and watch the visualization move through time.

    Also, if you've got your own data, you can load that too, which is certainly a nice touch.

    [via BBC News | Thanks, Lawrie & Liam]

  • The Devil is in the Digits?

    June 22, 2009  |  Statistics
    Photo by Leo Reynolds

    Undoubtedly you've been seeing a lot of headlines about the stuff going on in Iran. If you haven't, you must be living under a rock.

    One of the huge issues right now is whether or not fraud was involved in the election of Mahmoud Ahmadinejad.

    Wait a minute. Voting? Results? Numbers?

    Oh, we have to look at the data for this one. Bernd Beber and Alexandra Scacco, Ph.D. candidates in political science at Columbia University, discuss in their op-ed for The Washington Post:

    The numbers look suspicious. We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran's provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average -- a spike of 17 percent or more in one digit and a drop to 4 percent or less in another -- are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers.
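
    The last-digit check in the quote can be approximated with a quick simulation. This is a rough sketch, not Beber and Scacco's actual method: the function name, the assumed sample size of 116 vote counts, and the use of unrounded 17 and 4 percent cutoffs are all assumptions made for illustration, so the estimate need not match their figure.

```python
import random


def last_digit_extremes_probability(n_counts=116, high=0.17, low=0.04,
                                    trials=20_000, seed=0):
    """Estimate how often uniform last digits look as lopsided as the quote.

    n_counts is an ASSUMED number of vote counts (not given in the quote).
    A trial is a "hit" when at least one digit makes up a share >= high
    and at least one digit makes up a share <= low.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        # Draw a last digit (0-9) uniformly for each simulated vote count.
        counts = [0] * 10
        for _ in range(n_counts):
            counts[rng.randrange(10)] += 1
        shares = [c / n_counts for c in counts]
        if max(shares) >= high and min(shares) <= low:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    estimate = last_digit_extremes_probability()
    print(f"Estimated probability under uniform last digits: {estimate:.4f}")
```

    Raising trials tightens the Monte Carlo estimate at the cost of run time, and changing n_counts shows how sensitive the result is to the assumed number of vote counts.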

    Why does this matter? Well, humans are bad at making up sequences of numbers. Made-up number sequences look different from real random sequences (e.g. numbers from McCain/Obama). Beber and Scacco go on to describe the details of why the data look fishy. Those of us who've read Freakonomics will recognize the discussion.

    The result?

    The probability that a fair election would produce both too few non-adjacent digits and the suspicious deviations in last-digit frequencies described earlier is less than .005. In other words, a bet that the numbers are clean is a one in two-hundred long shot.

    Now what?

    [via Statistical Modeling]

  • The Current State of Social Data

    June 16, 2009  |  Social Data Analysis

    Check out my guest post on The Guardian's Data Blog on the current state of social data applications. There seem to be a ton of them, but none of them have really taken off (yet).

    While the post is more of an overview of what's available, I'd like to start a little discussion here on why these data apps haven't gained more popularity. There always seems to be a lot of buzz around launch time, but then it fizzles.

    Are people just not interested in interacting with data or do we need to approach the whole social data puzzle from a different angle?

  • Poll: Will Data Always Be Just For Geeks?

    June 10, 2009  |  Polls, Statistics
    Photo by penmachine

    I threw out a random thought a couple of months back. I tweeted, "Remember when computers used to be just for geeks? Now they're ubiquitous. We can do the same for data."

    To be honest, I was just babbling, but I've been giving it some thought, and you know, now I'm not so sure. There are so many applications popping up every day that promise to socialize data. To make it the YouTube of data. None of them have really taken off though.

    Is it because the visualization tools aren't advanced enough to make data accessible to the common user or is data simply meant to stay in the hands of experts?

    So this raises the question: will data always be just for geeks?

    If yes, what do you think makes data so distant to non-experts? If no, what will it take for non-experts to start interacting with data? Or are they already?

  • Rise of the Data Scientist

    June 4, 2009  |  Design, Statistics

    Photo by majamarko

    As we've all read by now, Google's chief economist Hal Varian commented in January that the sexy job in the next 10 years would be statistician. Obviously, I wholeheartedly agree. Heck, I'd go a step further and say they're sexy now - mentally and physically.

    However, if you went on to read the rest of Varian's interview, you'd know that by statisticians, he actually meant it as a general title for someone who is able to extract information from large datasets and then present something of use to non-data experts.
    Continue Reading

  • What’s Wrong With this Graphic on the Future of Information?

    June 1, 2009  |  Discussion, Mistaken Data

    This graphic on the history and future of information has been making the rounds. Several people sent it to me a while back, but it didn't seem quite right, so I didn't post it; however, this post from PZ Myers compelled me to take another look. Myers says:

    Some days, I think other people must be aliens. Or I must be. For instance, there's a lot of noise right now about this article analyzing the future of information and media that, if you read the comments, you will discover that people are praising to an astonishing degree. I looked at it and saw this graph [above graphic]. And my bullshit detector went insane. It's supposed to be saying something about where people are and will be getting their information, but there's no information about where this information came from, and it's meaningless!

    Yikes. Take out the boxing gloves. Looks like we've got another clash between the technical and the design-ish and mainstream crowds. The comments from both sides are also pretty interesting, with one group saying how visually appealing and informative the graphic is and the other criticizing it for failing in every way.

    Good or Bad?

    Clearly the graphic is not based on any real data or metric. It goes off history and probably a lot of Wikipedia entries, and then shapes and sizes go off feeling. So as an analytical graph, it doesn't work. But what about as an opinion in graph form? Does it work then? What do you think? Is this graphic a crime against all that is good in visualization or does it work for what it was trying to do?

    [Thanks, Patrick]

  • Data.gov is Live – Get Your Data While it’s Hot

    May 21, 2009  |  Data Sources

    Big news. Data.gov is now live. Government data is at your fingertips.

    The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data. Visit today with us, but come back often. With your help, Data.gov will continue to grow and change in the weeks, months, and years ahead.

    I was actually expecting an API of some sort, but it's a searchable catalog that makes it easier to find the datasets scattered across all the U.S. agency sites. I still need to explore more to figure out what exactly is there, but this is big news for data fans. What do you think of the new site? Discuss in the comments below.

    [via infosthetics]

  • 37 Data-ish Blogs You Should Know About

    May 6, 2009  |  Statistics, Visualization

    You might not know it, but there are actually a ton of data and visualization blogs out there. I'm a bit of a feed addict, subscribing to just about anything with a chart or a mention of statistics on it (and naturally have to do some feed-cleaning every now and then). In a follow-up to my short list last year, here are the data-ish blogs, some old and some new, that continue to post interesting stuff.
    Continue Reading

  • Google Adds Search to Public Data

    April 28, 2009  |  Data Sources, Online Applications

    Google announced today that they have made a small subset of public datasets searchable. Search for unemployment rate and you'll see a thumbnail at the top of the results. Click on it, and you get a very Google-y chart like the one above, so instead of searching for unemployment rates for multiple years, you can get it all at once.
    Continue Reading

  • Tracking Swine Flu Worldwide – Where and How, Plus Data

    April 28, 2009  |  Data Sources, Infographics

    Just about everywhere you go there's something in the news about swine flu, and so naturally, when I first heard about it, I waited for The New York Times to put up a graphic. That was the first one. Here's the second (above).
    Continue Reading

  • Narrow-minded Data Visualization

    April 22, 2009  |  Statistics

    I was going to let this one slide, but people kept commenting, essentially trashing FlowingData, and that's just not cool. As you might recall, I put in my picks for the best data visualization projects of 2008 a while back. They were the fine work of statisticians, designers, and computer scientists, all of them beautiful, and all of them built to tell an interesting story with the dataset at hand. None of them were traditional graphs or charts.
    Continue Reading

  • Millions of Money-in-Politics Data Records Now Available

    April 15, 2009  |  Data Sources

    The Center for Responsive Politics (CRP), a research group well-known for its tracking of monetary influence on United States politics, announced some great news. Their expansive dataset is now available to the public via OpenSecrets.

    Politicians, prepare yourselves. Lobbyists, look out. Today the nonpartisan Center for Responsive Politics is putting 200 million data records from the watchdog group's archive directly into the hands of citizens, activists, journalists and anyone else interested in following the money in U.S. politics.

    Yeah, 200 million data records. Correction. 200 million cleaned, formatted, and documented data records. Awesome. They've got data on campaign finances, lobbying, personal finances, and 527 organizations, which can be downloaded as CSV files or via the RESTful API. Let the mashups begin.

    [via Ben Fry | Thanks, Gegtik]

  • Taking a Look at Facebook Statistics from All Facebook

    March 24, 2009  |  Data Sources

    Facebook started as a spinoff of Hot or Not in 2003. Now Facebook is the world's biggest online social network. It's certainly come a long way, with millions of users around the world, the opening of the Facebook Platform, and quite possibly a personal data gold mine. All Facebook, the unofficial Facebook resource, provides news and, more importantly, data on growth, demographics, pages, and applications. A lot of it is locked behind a not-so-pretty widget, but it's interesting nevertheless. The above graphic is a look at some of that data.

    [Thanks, @mobiletek]

  • Data Visualization is Only Part of the Answer to Big Data

    March 20, 2009  |  Design, Exploratory Data Analysis

    How can we now cope with a large amount of data and still do a thorough job of analysis so that we don't miss the Nobel Prize?

    — Bill Cleveland, Getting Past the Pie Chart, SEED Magazine, 2.18.2009

    For the past year, I've been slowly drifting away from my statistical roots - more interested in design and aesthetics than in whether or not a particular graphic works or the more numeric tools at my disposal. I've always had more fun experimenting with a bunch of different things rather than really knuckling down on a particular problem. This works for a lot of things - like online musings - but you miss a lot of the important technical points in the process, so I've been (slowly) working my way back to the analytical side of the river.
    Continue Reading

  • What’s Wrong With this Financial Bubble Chart?

    February 26, 2009  |  Mistaken Data

    Average US Consumer Spending Bubble Infographic

    If there's anything good that has come out of America's financial crisis, it's the interesting and high-quality infographics. This isn't one of them. Below is an ill-conceived bubble chart from BillShrink that "shows" average U.S. consumer spending. Notice anything wrong with it?

    Bar versus bubble debate aside, there is a ton of room for improvement as well as huge need for some fact-checking and common sense. For a blog on a site for personal finance, the graphic is, well, not something to be proud of. FlowingData readers know that I like to stay away from heavy-handed critique on what works and what doesn't (I leave that to you guys), but this BillShrink graphic is just so clearly confusing that it's worth pointing out what doesn't work so we can learn from others' mistakes. Can you find the flaws?

    [Thanks, Jess]

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.