Category: Statistics

  • Tim Berners-Lee with an update on open data

    Posted Mar 15, 2010 to Data Sharing / 1 comment

    If people put data on the Web - government data, scientific data, community data - whatever it is, it will be used by other people to do wonderful things in ways they never could have imagined.
    Tim Berners-Lee, TED, February 2010

    Tim Berners-Lee, credited with inventing the World Wide Web, comes back to TED a year after his call for open, structured data with a quick update. Spoiler alert: things are looking good - and they're only going to get a lot better. But you already knew that, right?

    [via infosthetics]

  • Is Jeff Bridges most likely to win best actor?

    Posted Mar 7, 2010 to Statistics / 8 comments

    There's this article on CNN, from The Frisky, that has this little theory about who is most likely to win the Oscar for best actor:

    [T]he Oscar generally goes to the dude who has the most best actor and best supporting nominations under his belt already.

    That seemed like a curious statement. Didn't Forest Whitaker, Philip Seymour Hoffman, and Jamie Foxx recently win on their first nominations for the coveted award? Okay, so Hoffman was actually up against a bunch of other newbies, but what about the rest?

    Only 10 out of the past 29 winners, or just over a third, had the most nominations their year. Take a look at the data since 1980. Is the theory valid? You decide.

    Of course when Jeff Bridges wins tonight, the theory authors will declare victory, but oh well.

    Just for fun let's take a poll:

    Who will win the Oscar for best actor?
  • Think like a statistician – without the math

    Posted Mar 4, 2010 to Data Design Tips, Statistics / 52 comments

    I call myself a statistician, because, well, I'm a statistics graduate student. However, ask me specific questions about hypothesis tests or required sample size, and my answer probably won't be very good.

    The other day I was trying to think of the last time I did an actual hypothesis test or formal analysis. I couldn't remember. I actually had to dig up old course listings to figure out when it was. It was four years ago during my first year of graduate school. I did well in those courses, and I'm confident I could do that stuff with a quick refresher, but it's a no go off the cuff. It's just not something I do regularly.

    Instead, the most important things I've learned are less formal, but have proven extremely useful when working/playing with data. Here they are in no particular order.
    Continue Reading

  • Spirit of graph and dance is alive

    Posted Feb 24, 2010 to Statistics / 3 comments

    A good portion of my time in high school was spent trying to get into college. The rest of the time I was trying to look cool while doing it. Now of course I know better and fully embrace the inner geek. I'll never know what life would've been like had I thrown caution to the wind back then, but I'm guessing it would've been something like this.

    Amelia Downs, the girl in the video above, sent this in with her college application to Tufts University. The university encouraged applicants to submit short YouTube videos.

    Hopefully Amelia gets in, but if she doesn't, at least she got her 15 minutes: New York Times, Boston Globe, and ABC. Also, she can have the satisfaction of knowing she started a worldwide phenomenon called the scatter plot.

    [via infosthetics]

  • Get a Date With Your Online Profile Pic – Myths Debunked

    Posted Feb 10, 2010 to Statistics

    The online dating world can be a confusing place. How do you interact with others? Who should you contact? What should you say about yourself? There are a lot of decisions to make, but when it comes to grabbing the attention of potential dates, it all starts with your profile picture. Online dating site OkCupid analyzed over 7,000 profile pictures, debunking four myths:

    1. It's better to smile
    2. You shouldn’t take your picture with your phone or webcam
    3. Guys should keep their shirts on
    4. Make sure your face is showing

    Some of the results are pretty surprising. For example, men's photos were most effective when they weren't looking at the camera and not smiling:

    It was the opposite for women. A flirty face or a smile while looking at the camera proved most effective:

    Catch the full analysis here.

    [Thanks, Tom]

  • Data.gov.uk versus Data.gov – Which wins?

    Posted Feb 4, 2010 to Data Sources, Reviews / 22 comments

    Back in May last year, the US government launched Data.gov as a statement of transparency, and the Internet rejoiced. After the launch, excitement kind of fizzled with the actual Data.gov site, but big cities like San Francisco, New York, and Toronto got in on the open data party.

    Then just a couple of weeks ago, Data.gov.uk launched, which brought me back to the US counterpart. How do the two compare? Here's my take. Continue Reading

  • Understanding risk – play it safe or eat a bacon sandwich?

    Posted Jan 27, 2010 to Statistics / 7 comments

    David Spiegelhalter is a Professor of the Public Understanding of Risk at Cambridge University. He studies the choices we make, and how those choices can have an effect later on.
    Continue Reading

  • Data.gov.uk Gearing Up For Launch, er, Does Launch

    Posted Jan 20, 2010 to Data Sources, Mapping / 2 comments

    Update: I had scheduled this post for next week, but apparently, Data.gov.uk launched today. The site isn't loading for me right now though. I guess they weren't prepared for traffic.

    Data.gov, a catalog of US data, launched last year. Now it's the UK's turn. Well, not yet. But soon. Data.gov.uk is still under lock and key, but it has granted access to some developers. Ito Labs, the group behind mapping a year of OpenStreetMap edits, posted screenshots of their maps that show vehicle counts (above).

    Here are some comparison maps between 2001 and 2008, by vehicle type.

    Once Data.gov.uk is up, it'll be interesting to see how it compares to its US counterpart. Even more interesting will be the projects that come out of it.

    Despite all the brouhaha over Data.gov, not many useful projects (or datasets) come to mind. Can you think of any? There's still a long way to go on both sides, government and developer.

    [Thanks, Oliver]

  • Virtual Slot Machine Teaches the Logic of Loss

    Posted Dec 18, 2009 to Infographics, Statistics / 9 comments

    This interactive by the Las Vegas Sun describes how, in the long run, you're going to lose every single penny when you throw your hard-earned money into a slot machine. In the short term though, it is possible to win. It's all probability. It's also why statisticians don't gamble. Nobody plays a game that they're practically guaranteed to lose, unless they're a masochist - or Al Pacino in that one horrible sports gambling movie with Matthew McConaughey.

    One clarification on the snippet about payout percentage.

    Here's what the graphic reads:

    This is the ratio of money a player will get back to the amount of money he bets, which is programmed into the slot machine. If a machine has payout percentage of 90 percent, that means 90 percent of the money someone bets should statistically be won back. It means a player is not likely to lose 10 percent of the amount initially put into the machine, but rather 10 percent, on average, over time.

    The wording is kind of confusing. To be more clear - over time, on average, you'd lose 10% of the money you put in per bet. This is an important note, because it's how casinos make money. For example, when you play Blackjack perfectly (sans card-counting), you'll lose on average 2% (or something like that) per hand, so play long enough, and you're going to lose all your money.

    Imagine you have two buckets. One is filled with water. The other is empty. Transfer the water back and forth between the two buckets. Some of the water drips out during some of the transfers. Eventually, all the water is on the ground.
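
    The leaky-bucket idea can be sketched in a few lines of Python. This is only a toy model, not how the Las Vegas Sun interactive or any real slot machine works: the machine here is hypothetical, winning with probability win_prob and paying a single prize sized so that the average return per bet equals the payout percentage. The function names and parameters are made up for illustration.

```python
import random


def expected_loss(bet, payout_pct, spins):
    """Average total loss after `spins` bets of `bet` each."""
    return bet * (1.0 - payout_pct) * spins


def play_until_broke(bankroll, bet, payout_pct, win_prob, rng, max_spins=1_000_000):
    """Spin a toy machine until the bankroll can't cover another bet.

    Hypothetical machine (an assumption, not a real pay table): each spin
    wins with probability `win_prob` and pays bet * payout_pct / win_prob,
    so the average return per bet is payout_pct * bet.
    Returns the number of spins played and the money left over.
    """
    prize = bet * payout_pct / win_prob
    spins = 0
    while bankroll >= bet and spins < max_spins:
        bankroll -= bet              # pay for the spin
        if rng.random() < win_prob:  # occasionally some water goes back in
            bankroll += prize
        spins += 1
    return spins, bankroll


if __name__ == "__main__":
    # On average, 1,000 one-dollar bets at a 90 percent payout cost about $100.
    print(expected_loss(1.0, 0.90, 1000))
    # One simulated session starting with $100; the seed is arbitrary.
    print(play_until_broke(100.0, 1.0, 0.90, 0.30, random.Random(2009)))
```

    Any single session can run hot for a while, which is the short-term winning part. The average loss per bet never goes away though, so the longer the session, the emptier the bucket.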

    Ah yes, intro probability is fun. Play the virtual slot machine and do some learning for yourself.

    [Thanks, Tyson]

  • Past 25 Years of Consumer Spending

    Posted Dec 2, 2009 to Data Sources, Statistics / 33 comments

    How has consumer spending changed over the past 25 years? Do we spend more on some things and less on others than we did in the early 80s? In this interactive, based on data from the Bureau of Labor Statistics, you can explore just that.

    Continue Reading

  • Fox News Makes the Best Pie Chart. Ever.

    Posted Nov 26, 2009 to Mistaken Data, Ugly Visualization / 89 comments

    What? I don't see anything wrong with it.
    Continue Reading

  • Choose Your Own Adventure – Watch the Stories Unfold (Updated)

    Posted Nov 19, 2009 to Infographics, Statistics / 4 comments

    Interaction designer Christian Swinehart takes a careful look at the popular Choose Your Own Adventure books from the 1980s. We saw something like this before, but Swinehart takes it a step further.
    Continue Reading

  • Class Size and SAT Scores By State

    Posted Nov 10, 2009 to Statistics / 39 comments

    Are there any differences in student performance between schools with small classes (as in students per teacher) and those with large classes?

    The natural response is yeah, of course, because if there are fewer students per teacher, each student gets more individual attention from the teacher. Then again, I went to pretty big elementary and high schools where some classes were in the high thirties. It didn't seem all that bad.

    Students Per Teacher and SAT Scores

    Let's take a look at state-level student-teacher ratios and SAT scores to see if there is any difference. Click the image below for the full version.

    SAT Scores and Student-teacher Ratio

    From the picture above, it does look like there is a difference. States that score highest (highlighted in green) on the SAT on average tend to have lower student-teacher ratios. High-ratio states, however, have scores that hover around the national average.

    But there are also many states with ratios below the national average (small-ish classes) that score below the national average.

    Maine, for example, has the second lowest ratio in the country, but also averages some of the lowest scores. On the other hand, Utah has the highest ratio, but scores well above the national average on all SAT sections.
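
    If you want to put a rough number on a pattern like this, one common option is a correlation coefficient between ratio and score. The sketch below is not part of the original analysis. The pearson_r helper is written out by hand for illustration, and the values are made-up placeholders, not the real state figures, so swap in the actual data before reading anything into the result.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values, NOT the real state figures.
ratios = [12.0, 14.0, 16.0, 18.0, 20.0]            # students per teacher
scores = [1100.0, 1050.0, 1080.0, 1010.0, 1040.0]  # average SAT score

print(round(pearson_r(ratios, scores), 2))
```

    A value near -1 would mean higher ratios go with lower scores almost in lockstep, and a value near 0 would mean there isn't much of a linear relationship. Outliers like Maine and Utah are exactly the kind of thing that pulls the number toward zero.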

    What do you think? Does student-teacher ratio matter? Post your thoughts in the comments below.

  • Unemployment, 2004 to Present – The Country is Bleeding

    Posted Nov 4, 2009 to Data Sources, Featured, Mapping / 47 comments

    The Bureau of Labor Statistics released the most recent unemployment numbers last week. Things aren't looking good for the unemployed, I'm afraid.

    I showed my younger sister the maps. Her response: "It looks like the country is bleeding."

    While the recession is "over," the unemployment rate rose to 9.8% in September from 9.7% in August. That's 214,000 more people who are jobless in the United States. The last time unemployment was this high was back in June 1983 when it was 10.1%.

    Check out the more detailed view here:

    Unemployment 2004 to present

    From 2004 to 2007, unemployment was actually decreasing, but things went sour in 2008, and we've been trying to bounce back ever since.

    Update: See the step-by-step tutorial on how you can make a map like this with your data.

  • Target Store Openings Since the First in 1962 – Data Now Available

    Posted Oct 22, 2009 to Data Sources / 7 comments

    FlowingData readers who have been around for a while will remember I made a map early this year that showed the growth of Target stores across America. It starts with the first one in 1962 and then goes from there. It was a follow-up to the Walmart map, which I shared the code and data for.

    Anyways, I often still get emails about the Target data. I finally got around to asking if I could release it, and lucky for you, the answer was yes. So here you are. Go wild.

    By the way, if anyone has similar data for Starbucks, let me know. There's gotta be at least one Starbucks analyst who reads this blog. Maybe?

    [Thanks, Cole]