• Speed Dating Data – Attractiveness, Sincerity, Intelligence, Hobbies

    February 6, 2008  |  Data Sources

    In their paper Gender Differences in Mate Selection: Evidence from a Speed Dating Experiment, Fisman et al. had a bit of fun with a speed dating dataset. Here's what they found:

    Women put greater weight on the intelligence and the race of partner, while men respond more to physical attractiveness. Moreover, men do not value women's intelligence or ambition when it exceeds their own. Also, we find that women exhibit a preference for men who grew up in affluent neighborhoods. Finally, male selectivity is invariant to group size, while female selectivity is strongly increasing in group size.

    The dataset is substantial, with over 8,000 observations of answers to twenty-something survey questions. With questions like How do you measure up? and What do you look for in the opposite sex?, this dataset is definitely high on the human element and should be fun to play with.

    [via Statistical Modeling]

  • Data Makes Reasonable Decision-making Possible

    February 6, 2008  |  Miscellaneous

    This guest post is by Andrew Gelman from Statistical Modeling, Causal Inference, and Social Science. He answers the question, "What is data and why should we care about it?"

    Good data are better than bad data, but worst of all are data whose quality you can't assess. Beyond this, we want to use statistical methods that allow us to combine data from many sources. I'm comfortable with regression and multilevel models, but other methods are out there too. In any case, we have to care about our data because inferences and decisions are just about always data-based, implicitly if not explicitly. Being the person in the room with the hard data gives you authority, as well it should.

  • Tap Into the Wisdom of Crowds, Make Money by Predicting Future Events

    February 5, 2008  |  Social Data Analysis

    Predictify takes James Surowiecki's The Wisdom of Crowds to heart. Surowiecki argues that when certain factors are present (for example, group diversity), the group is always smarter than the individual. Predictify has turned this "principle" into a money-making platform.

  • May the Data Be With You, Young Skywalker

    February 4, 2008  |  Miscellaneous

    In response to my question, "What is data and why should we care about it?", Zach Gemignani from Juice Analytics answered:

    Obi-Wan Kenobi could have been speaking about data in businesses when he said: "It's an energy field created by all living things. It surrounds us, and penetrates us. It binds the galaxy together."

    Data is the residue of every action and interaction that takes place in a company, with customers, and in the marketplace. Businesses have created complicated and effective nets to capture this data as it flies off in all directions. Unfortunately, mountains of data mean nothing. Like young Luke Skywalker's inability to control The Force, a company's inability to make use of data is nothing more than frustration and untapped potential.

    Making use of data takes a subtle combination of capabilities. It takes experience and context about the business, speed and skill to manipulate data, and an ability to visualize and communicate results. Data in the wrong hands is useless if not dangerous; in the right hands data can transform into new insights and informed decisions.

  • What is Data and Why Do We Care About it So Much?

    February 4, 2008  |  Miscellaneous

    I've been fortunate to have worked with people from lots of different fields: statistics, ecology, computer science, engineering, design, etc. If I've learned anything, it's that everyone has a different idea of what data is and why it matters.

    I've found that until I understand what my collaborators mean by data and what they (and I) are trying to get out of a dataset, it's nearly impossible to get anything useful done.

    To make things a bit clearer (and for my own enjoyment), I asked a select group of people a single question:

    What is data and why should we care about it?

    Those who responded are from different areas of expertise, ranging from statistics, to business, to computer science, to design. Some names you'll recognize while others will be new to you. All are doing interesting things with data.

    I've been looking forward to this series for a couple of weeks now, and my hope is that you will gain a better understanding about what data is and how people are putting it to use. Keep an eye out for posts with the black square image above.

    Here is who has answered so far:

    If you'd like to answer the question yourself, I'd love to see your response too, or if you write an answer on your own blog, please do post the link in the comments below.

  • Who’s Going to Win Super Bowl XLII?

    February 3, 2008  |  Statistics

    I just put down $20 on today's game for the New York Giants to cover the 12-point spread. Of course, knowing me, I got to thinking about how that betting line is decided. Is there one person who calculates the spread? Do Las Vegas casinos just put up numbers based on past experience? I did a little bit of research, and here's what I found.
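
    For anyone new to the term, covering a spread is simple arithmetic, sketched below in Python. The function name and the scores are hypothetical, and the sketch assumes the Giants are the underdog getting 12 points: the bet wins if the Giants win outright or lose by fewer than 12, and a loss by exactly 12 is treated as a push.

```python
def underdog_bet_result(underdog_score, favorite_score, spread):
    """Return 'win', 'push', or 'lose' for a bet on the underdog plus the spread."""
    adjusted = underdog_score + spread  # give the underdog its head start
    if adjusted > favorite_score:
        return "win"
    if adjusted == favorite_score:
        return "push"
    return "lose"


# Hypothetical final scores, not predictions.
print(underdog_bet_result(20, 27, 12))  # underdog loses by 7, inside the spread
print(underdog_bet_result(10, 24, 12))  # underdog loses by 14, outside the spread
print(underdog_bet_result(12, 24, 12))  # underdog loses by exactly 12
```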

  • Weekend Minis – Government, Environment & Angry Employee

    February 2, 2008  |  Data Sources

    FedStats - Provides access to the full range of official statistical information produced by the Federal Government, including population, education, crime, and health care.

    MAPLight - A detailed database that brings together information on campaign contributions and votes in the California legislature. Check out the video tour.

    EarthTrends - A collection of information regarding the environmental, social, and economic trends that shape our world.

    Angry Employee Deletes All of Company's Data - A woman about to "lose" her job goes to the office at night and deletes 7 years' worth of data. Can we say backup, please?

  • Bad Statistics Leads to Poor Results and a Questionable Trial Verdict

    February 1, 2008  |  Mistaken Data

    Peter Donnelly talks about the misuse of statistics in his TED talk from a couple of years back. The first two-thirds of the talk is an introduction to probability and its role in genetics, which, admittedly, didn't hold much of my interest. The last third, however, gets a lot more interesting.

    Donnelly talks about a British woman who was wrongly convicted in large part because of a misuse of statistics. A so-called expert cited how improbable it would be for two children to die of sudden infant death syndrome, but it turns out that "expert" was making incorrect assumptions about the data. This doesn't surprise me since it happens all the time.

    Lesson Learned

    People misuse statistics every day (intentionally and unintentionally), and oftentimes it doesn't hurt much (which doesn't make it any better), but in this case improper use directly affected someone's life in a very big way. One of the most common assumptions I see is that every observation is independent, which often is not the case. As a simple example, if it's raining today, does that change the probability that it will rain tomorrow? What if it didn't rain today?

    In other words, the next time you're thinking of making up or tweaking data, don't; and the next time you need to analyze some data but aren't sure how, ask for some help. Statisticians are nice and oh so awesome.
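
    To make the independence point concrete, here is a minimal Python sketch. The probabilities are made up purely for illustration (they are not from Donnelly's talk or from any weather record), and the function names are hypothetical. It compares the chance of two rainy days in a row when the days are wrongly treated as independent against the chance when rain today makes rain tomorrow more likely.

```python
def joint_if_independent(p_event):
    """Probability of the event happening twice, assuming independence."""
    return p_event * p_event


def joint_with_dependence(p_event, p_event_given_event):
    """Probability of the event happening twice, using the conditional probability."""
    return p_event * p_event_given_event


# Hypothetical numbers, chosen only for illustration.
P_RAIN = 0.3              # chance of rain on any given day
P_RAIN_AFTER_RAIN = 0.6   # chance of rain tomorrow given that it rained today

naive = joint_if_independent(P_RAIN)
realistic = joint_with_dependence(P_RAIN, P_RAIN_AFTER_RAIN)

print(f"Assuming independence: {naive:.2f}")
print(f"Allowing dependence:   {realistic:.2f}")
print(f"The naive figure is off by a factor of {realistic / naive:.1f}")
```

    With these invented numbers, the independence assumption gives only half the true chance of two rainy days in a row. Squaring a probability as if two events were unrelated is exactly the kind of shortcut the rain question is meant to flag.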

    Here's Donnelly's talk:

  • NSF Science and Engineering Visualization Challenge

    January 31, 2008  |  Visualization

    The National Science Foundation is running their annual Science and Engineering Visualization Challenge.

    Some of science’s most powerful statements are not made in words. From the diagrams of DaVinci to Hooke’s microscopic bestiary, the beaks of Darwin’s finches, Rosalind Franklin’s x-rays or the latest photographic marvels retrieved from the remotest galactic outback, visualization of research has a long and literally illustrious history. To illustrate is, etymologically and actually, to enlighten.

    You can do science without graphics. But it’s very difficult to communicate it in the absence of pictures. Indeed, some insights can only be made widely comprehensible as images. How many people would have heard of fractal geometry or the double helix or solar flares or synaptic morphology or the cosmic microwave background, if they had been described solely in words?

    To the general public, whose support sustains the global research enterprise, these and scores of other indispensable concepts exist chiefly as images. They become part of the essential iconic lexicon. And they serve as a source of excitement and motivation for the next generation of researchers.

    They've been accepting submissions since September of last year and will continue to do so until May 31, 2008. The rules are pretty wide open, with last year's winners in the areas of photography, illustration, and interactive and non-interactive media. Basically, it's whatever you want it to be. The winners will be published in the journal Science, and one of the winning submissions will get to be on the cover of the prestigious journal.

  • Journal of Quantitative Analysis in Sports is Live

    January 30, 2008  |  Statistics

    Whenever I tell people that I study Statistics, they almost always respond, "So what do you do with that?" After they get over their initial shock, I often get, "If I were in Statistics, I'd study sports statistics." I usually respond by telling them that while it would probably be a lot of fun, I don't think there is much money in it (because I gotta eat, right?) and that statisticians usually take that on as a part-time gig. I'm thinking I might have to change that response though, as the game of sports statistics is showing signs of life with the recent Journal of Quantitative Analysis in Sports.

    Articles in the Journal of Quantitative Analysis in Sports (JQAS) come from a wide variety of sports and perspectives and deal with such subjects as tournament structure, frequency and occurrence of records and the optimal focus of training for decathlons. Additionally, the journal serves as an outlet for professionals in the sports world to raise issues and ask questions that relate to quantitative sports analysis. Edited by economist Benjamin Alamar, articles come from a diverse set of disciplines including statistics, operations research, economics, psychology, sports management and business.

    Maybe I'll read regularly and take up sports betting as my new hobby.

  • A Chat with The New York Times on Making Data More Engaging

    January 29, 2008  |  Design

    Jared Spool had a chat with Andrew (multimedia) and Steve (graphics) at The New York Times. I'm sure you're familiar with their work. They chat about the design process of the interactive pieces on The Times site like the transcript analyzer, the home run chart, and plenty of other specific examples. They also go into a bit about where they get inspiration from (e.g. old Fortune magazines, photographs, advertisements) as well as how they go about creating their more innovative pieces.

    Keep in mind it's on the User Interface Engineering blog, so it's mostly focused on, well, the user interaction and design and less on where data comes from, the journalistic process, etc., but still, it's a pretty good listen.

    [via Visual Methods]

  • Visualization of Smiling Faces – Microsoft Live / Operation Smile

    January 28, 2008  |  Data Art

    For the re-launch of the Microsoft Windows Live platform, Firstborn created a generative art installation taking thousands of smiling faces and placing them into a 3-D world. It was an outdoor installation (done in Processing) projected on a seven-story sphere, and I am sure it wowed a whole lot of people. It's definitely amazing me, and all I'm seeing are screenshots and a demo.

  • Weekend Minis – Maps, Motion & Resources

    January 26, 2008  |  Visualization

    Interactive Travel Time and House Price Maps - Tom from Stamen recently announced some really slick mapping. They're very attractive and very responsive. Sidenote: Look forward to a guest post from Tom in the near future.

    175+ Data and Information Visualization Examples and Resources - Meryl has posted an extensive list of visualization examples and resources available online. Thanks for linking here, Meryl!

    GPSed - A site that takes advantage of the data available from your mobile phone, mainly pictures and your GPS trace.

    Visualizing the History of Living Spaces - Ivanov et al. discuss the challenges of visualizing motion data from 215 motion sensors in a large office building.

  • Books that Make You Dumb (Not Really)

    January 26, 2008  |  Ugly Charts

    Virgil Griffith has created a series of graphs called Books that Make You Dumb. He correlates the top books on Facebook at each school with that school's average SAT score. Notice Freakonomics is pretty far to the right. Nice.

    The graphs, of course, aren't really that statistical, nor are they especially beautiful, but hey, just take it for what it is, and it's kind of amusing. Plus, it's a good example of how you can use data from different sources to find something interesting.

  • 6 Influential Datasets That Changed the Way We Think

    January 24, 2008  |  Visualization

    The thing about data is that it can be very convincing. Maybe it's because it's so hard to argue against numbers, or maybe it's just that there's so much of it. In any case, here are six datasets that undoubtedly changed the way some people behave or showed us something that brought about a different way of thinking about things.

  • Walker Tracker – A Community Site for Pedometer Fans

    January 23, 2008  |  Data Sharing

    Those of you who have been around since the beginning know that I am just obsessed with my pedometer. Lately, though, I haven't felt inclined to go for a winter stroll in the below-freezing weather. When I was keeping track of my steps, one of the difficulties was staying consistent. Sometimes I would forget to wear my pedometer, while other times I would forget to record my steps.

    I imagine Walker Tracker could help a bit in solving that second problem. I know it was always easier to make it to the gym when I knew one of my friends was going to meet me there. Walker Tracker is like that friend at the gym. The site lets you keep track of your steps as well as see how others are doing.

    We're trying to change the world. We're trying to get you and us and everyone we know off the elevator and out of the car and onto the sidewalks and trails. We're doing it one step at a time.
    Get up, stand up and walk.

    OK, maybe it's a little hoorah, but if you feel like actually accomplishing a new year's resolution this year, Walker Tracker could be a good place to start.

    [via Web Worker Daily]

  • How a Trip to the Dentist Got Me Thinking About Open Data

    January 22, 2008  |  Visualization

    Warning: Tangent ahead, but I promise, there's a point.

    About a year ago, I went to my 6-month teeth checkup, and the dentist told me that I had a cavity on the bottom back left and another on the bottom back right. Since I was about two years overdue for a checkup (and didn't floss every day), I wasn't surprised.

    One week later, I was back to get my fillings. I sat down in that terrifying chair that looks like something aliens use to probe specimens. The drilling began.

    My teeth are really sensitive, so no matter how many shots of novocaine she injected (3 or 4), I still felt pain. Here's how it went with the first filling. She drilled. I winced. She stopped. We took a short 1-minute break. She drilled. I winced. We took a break.

    We went on like that for about 20 minutes -- all the while she kept telling me it was a tiny cavity and that it shouldn't hurt. Yeah, OK, whatever. Maybe if she actually stuck the needle in the nerve and not just some random place in my gums, it would have worked.

    Anyways, she finally finished and suggested we put off the second filling until the next visit in six months. I thought to myself, "Uh, won't my cavity just get worse in 6 months??" I was in enough pain already though (with beads of sweat to prove it) so I agreed despite my concerns.

    I ended up missing that next appointment.

  • Google Decides to Host a Whole Lot of Scientific Data – Palimpsest Project

    January 21, 2008  |  Data Sources

    In its continued efforts for absolute power over all information ever created in the world, Google will be hosting open-source scientific datasets at its research section. Here are the presentation slides from Google's Jon Trowbridge:

    In the next few weeks, terabytes of data will be made available to the public. For example, all 120 terabytes of Hubble Space Telescope data is going to be online. That's kind of cool but kind of scary too. Such a large amount of data is bound to affect lots of people on many different levels.

    For scientists, data will be available for deeper research. For the scientists who generated the data, their research could be placed under more critical scrutiny. Existing data applications might be eclipsed by the data giant, or it could go the other way such that the general public grows more aware of data-type things. Mashups will in turn spring up as well as more visualization, I am sure.

    All of this Doesn't Matter If...

    Of course, all of this depends on what data end up on the Google servers and how easily accessible the data are. Knowing Google, I don't think accessibility will be a problem. Getting data will be the super hard part. Who will be willing to contribute their data? What type of data will get contributed? Will it be the good, raw data or more cleaned and processed data? Do researchers even want to share their data with the rest of the world?

    It's going to be interesting to see what goes up on Google Research in these coming weeks.

    [via Wired and Pimm]

  • Mapping Google Access Data from (suit)men

    January 18, 2008  |  Mapping

    There's a nice real-time (?) map on (suit)men Entertainment. Click the black rectangle on the bottom left-hand corner to see the entire map. Supposedly the map is powered by Google, so I want to say it's showing search data or something of that sort. To be honest though, I have no clue.

    Whenever a number pops up, there's a line that connects some country to Japan (the site's origin), so I'm guessing they're mapping something like accesses to the (suit)men site from whatever country. Oh well, no matter. Look how pretty. It's entertainment, and it managed to entertain me for a good few minutes (which says a lot given my short attention span :). Does anyone know what they're showing?

    [via Simple Complexity]

  • Iraq Body Count: A Human Security Project

    January 17, 2008  |  Data Sources

    Iraq Body Count keeps track of civilian deaths by cross-checking media reports and hospital, morgue, and NGO figures. Along with a widget counter that you can post on your blog or site, IBC also makes their database available for download.

    Systematically extracted details about deadly incidents and the individuals killed in them are stored with every entry in the database. The minimum details always extracted are the number killed, where, and when.

    The data comes in two sets -- incident reports and individuals who have lost their lives -- in the form of CSV files.

    The data is a little depressing, but still very necessary.

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.