  • Open Thread: Is Google Latitude Dangerous?

    February 12, 2009  |  Data Sharing, Discussion, Online Applications

    Google recently released Google Latitude, which is an online application that lets you share your location with online friends:

    Of course when any application shares where you are at any given time, people start to feel like Big Brother is looming in the background ready to sneak up on us from behind a giant bush. Some call it a real danger, but is it really? I put this question out to all of you:

    Is Google Latitude a danger to anyone who uses it?

    My take on things is that people are already doing it anyways, so why not make it easier for those who are interested? Sure, if some stalker got a hold of your location, that could be bad, but that's true for a lot of data... credit card statements, cell phone logs, Twitter... As long as the proper security measures are put in place, I don't see what all the fuss is about.

  • Sensors in Footballs – Was the Pass Good?

    December 30, 2008  |  Statistics

    Graduate student researchers are pretty much putting sensors in everything these days. There's always more data to collect and more information to gather. Computer engineering students from Carnegie Mellon University experiment with sensors in footballs and gloves to measure grip, trajectory, speed and position.

    "You'd never want to replace the human referees because they make these calls based on years of experience, and no technology can replace that," she said. "But in addition to the instant replay, if you had a supplementary system that said this is exactly where the ball landed and where the player stopped with it, you could make these kinds of calls accurately."

    So far, she and her squad of undergraduate and graduate students have focused on two things: gloves with touch sensors that can transmit that information wirelessly to a computer, and a football equipped with a global positioning receiver and accelerometer that can track the location, speed and trajectory of the ball.

    Eventually, the same kind of sensors used in the gloves could be adapted to shoes, to measure stride and running patterns, or even shoulder pads, to calculate blocking positions and force.

    Yes, it's the end of the post-game show as we know it.

  • All You Can Eat at the Twitter Data Buffet

    December 24, 2008  |  Data Sources

    Philip from infochimps posts the results of some heavy Twitter scraping. Data for 2.7 million users, 10 million tweets, and 58 million edges (i.e. connections between users) to satisfy your data hunger are available for download. I know a lot of you social network researchers will especially appreciate the big dataset, and best of all, Twitter gave Philip permission to release. Yes, you could use the Twitter API, but isn't it better when someone does it for you?

    Download the data here. The password is the Ramanujan taxicab number followed by the word 'kennedy' - all one word. Google is your friend if that doesn't make sense.

    [Thanks, Tim]

  • Do You Hate Statistics as Much as Everyone Else?

    December 15, 2008  |  Statistics

    Photo by Darwin Bell

    It happened again. I told someone I study statistics. He told me that he hated statistics in college. It doesn't annoy me like it used to - I've come to expect it - but why do so many people have this beef with stat? Is it really that boring? Confusing? What is it about statistics that turns people off? So I reach out to all of you:

    What is it that makes statistics so uninteresting?

    I'm going to assume that the icky factor is less for FlowingData readers (obviously), but still, I implore you - tell me why statistics sucks. I must know.

  • Scientists Can Now Map Your Dreams to an Image

    December 12, 2008  |  Statistics

    I thought this was a joke when I first read it, but scientists from Japan’s ATR Computational Neuroscience Laboratories have developed software that can map brain activity to an image. Subjects were shown letters from the word neuron and images were reconstructed and displayed on a computer screen.

    A spokesman at ATR Computational Neuroscience Laboratories said: "It was the first time in the world that it was possible to visualise what people see directly from the brain activity.

    "By applying this technology, it may become possible to record and replay subjective images that people perceive like dreams." The scientists, led by chief researcher Yukiyasu Kamitani, focused on the image recognition procedures in the retina of the human eye.

    It is while looking at an object that the eye's retina is able to recognise an image, which is subsequently converted into electrical signals sent into the brain's visual cortex.

    The research investigated how electrical signals are captured and reconstructed into images, according to the study, which will be published in the US journal Neuron.

    I'm not sure how much brain activity from the retina has to do with activity during dreams, but it's interesting nevertheless (although I am sure - like all interesting science - it is slightly hyped by the media).

    [via Telegraph & Pink Tentacle & Chunichi]

  • Amazon Gets In On the Public Data Arena

    December 5, 2008  |  Data Sources

    It was really only a matter of time, but Amazon now hosts public data sets. Not small data sets though - more like the ones in between 1 gigabyte and 1 terabyte:

    Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.

    Previously, large data sets such as the mapping of the Human Genome and the US Census data required hours or days to locate, download, customize, and analyze. Now, anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. For example, users can produce or use prebuilt server images with tools and applications to analyze the data sets. By hosting this important and useful data with cost-efficient services such as Amazon EC2, AWS hopes to provide researchers across a variety of disciplines and industries with tools to enable more innovation, more quickly.

    There's the human genome data set, US Census data from the past 3 decades, labor statistics, and some others. Still waiting on Google to follow through with their data hosting plans.

    [via TechCrunch | Thanks, David]

  • Guess What State Searches for ‘Poo’ the Most – StateStats

    December 5, 2008  |  Mapping, Statistics

    StateStats is like Google Insights but on a state level. Type in a search term and get Google search levels with correlations to certain "metrics" like obesity or support for Obama. Any Web application that uses correlation tends to make me feel a bit iffy, but it's just for fun, so I guess it's okay.

    Being the immature man-child that I am, the first thing I typed in the search field was poo. I thought it was hilarious, er, interesting that Louisiana's relative search rate was so much higher than all the other states. Apparently, obesity correlates moderately.
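
    For anyone curious what "correlates moderately" means in practice, here is a minimal sketch of a Pearson correlation between a search rate and a state-level metric. The numbers are made up for illustration, and I'm assuming StateStats uses something like a plain Pearson coefficient; the site's actual method isn't described here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five states (NOT real StateStats data).
search_rate = [1.0, 2.0, 3.0, 4.0, 5.0]       # relative search level
obesity_pct = [22.0, 25.0, 24.0, 29.0, 30.0]  # made-up obesity percentages

r = pearson(search_rate, obesity_pct)
print(f"r = {r:.2f}")
```

    A value near 1 or -1 means the two columns move together closely, and a value near 0 means they barely do. Either way, it says nothing about one causing the other, which is why this kind of thing is best kept in the "just for fun" category.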

    I'm sure all of you will search for more sophisticated terms.

    [Thanks, @Chimp711]

  • Neighborhood Boundaries with Flickr Shapefiles

    November 28, 2008  |  Data Sources, Mapping

    Neighborhood Boundaries by Tom Taylor uses Flickr Shapefiles and Yahoo! Geoplanet "to show you where the world thinks its neighbors are." Yahoo! provides access to the Where on Earth (WOE) database, which attempts to describe locations as a hierarchy. For example - a town belongs to a city, a city to a county, a county to a state. The Flickr API stores shape files identified by the WOE ID. Here's the punchline. The shapefiles are built using only the latitude and longitude from geotagged photos on Flickr. There's no GIS involved here.
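
    To get a feel for how an outline can come out of nothing but points, here is a rough sketch that wraps a convex hull around a handful of (longitude, latitude) pairs. This is a stand-in technique chosen for simplicity, not necessarily what Flickr does, and the coordinates are made up. It also treats latitude and longitude as flat x/y values, which is only a reasonable assumption at neighborhood scale.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower boundary from left to right.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper boundary from right to left.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other, so drop it.
    return lower[:-1] + upper[:-1]


# Hypothetical geotagged photo locations as (longitude, latitude).
photos = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]

boundary = convex_hull(photos)
print(boundary)
```

    The photo in the middle drops out and the four corner photos become the boundary. A convex hull can't follow concave edges, so real neighborhood shapes would need something tighter, but the idea of going from tagged points to a polygon is the same.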

    Why this matters, I can't really say. I think it's mostly to show how much data is stored in geotagged Flickr photos. I'm no GIS expert though. Anyone care to comment on the significance?

    [Thanks, @couch]

  • US Oil Doesn’t Come From Where You Think it Does

    November 21, 2008  |  Data Sources, Mapping

    Where do you think the US imports the most oil from? Most of us would probably say somewhere in the Middle East, but Jon Udell does some number crunching and shows that to be a misconception. Canada supplies us with the most oil (according to the US Department of Energy).

    This realization, however, isn't the post's punchline. It's how easy it was for Jon to figure this stuff out. With some help from Dabble DB (an app that lets you easily use a database without too much technical fuss), Jon was able to parse the data and map it by region with a few swift clicks.

    We’re really close to the point where non-specialists will be able to find data online, ask questions of it, produce answers that bear on public policy issues, and share those answers online for review and discussion. A few more turns of the crank, and we’ll be there. And not a moment too soon.

    We're gettin' there.

    [Thanks, Tim]

  • New York Times Visualization Lab – Collaboration with Many Eyes

    October 28, 2008  |  Data Sources

    It was just a little over a week ago that The New York Times announced their Developer Network i.e. Campaign Finance API. Yesterday, they announced something more - the Visualization Lab. In collaboration with the Many Eyes group, the Times has rolled out a Many Eyes for data used by Times writers. You can visualize, explore, and comment on data posted at the Visualization Lab in the same way that you can at Many Eyes.

    Today, we’re taking the next step in reader involvement with the launch of The New York Times Visualization Lab, which allows readers to create compelling interactive charts, graphs, maps and other types of graphical presentations from data made available by Times editors. NYTimes.com readers can comment on the visualizations, share them with others in the form of widgets and images, and create topic hubs where people can collect visualizations and discuss specific subjects.

    A Few More Steps

    I said the API was a good step forward. The Visualization Lab is more than a step. No doubt The Times heard what I said about their API and decided to roll with it since I am the head authority on everything. Yes, I'm totally kidding, in case that didn't come across as a joke. Come on now.

    I'm looking forward to seeing how well Times readers take to this new way of interacting.

    [Thanks, William]

  • Playboy Playmate Curves and the State of the Economy

    October 24, 2008  |  Data Sources, Economics

    Terry Pettijohn and Brian Jungeberg of Mercyhurst College took a very close look at the curves, um, measurements of past Playboy Playmates of the Year in relation to the state of the economy.

  • Lexical Analysis of Presidential Debates and the Windbag Index

    October 23, 2008  |  Statistics

    Martin Krzywinski, whose previous work includes Circos, digs deep into the presidential debate transcripts with tedious manual (or was it automatic?) annotation of words (noun/verb/adjective/adverb), Wordle, and his custom metric called the Windbag index that measures speech complexity.

  • Who’s Leading Whom? Predictive Markets Versus Polls

    October 22, 2008  |  Statistical Visualization, Statistics

    This is a guest post from Michael Drumheller, Dirk Karis, Raif Majeed and Robert Morton of Tableau Software. They use Tableau to explore the relationship between polls and predictive markets.

    Predictive markets such as Intrade and the Iowa Electronic Markets have attracted more attention this year than in past Presidential elections. Some political observers such as ElectoralMap.net look to these markets as indicators of who's winning or losing.

  • New York Times Rolls Out Campaign Finance API

    October 16, 2008  |  Data Sources

    The New York Times announced the opening of their Developer Network a couple of days ago. It's their "API clearinghouse and community." It might seem kind of weird that a newspaper company has an API, but as many FlowingData readers know, the Times prides itself on innovation.

    The Campaign Finance API is currently available:

    With the Campaign Finance API, you can retrieve contribution and expenditure data based on United States Federal Election Commission filings. Campaign finance data is public and is therefore available from a variety of sources, but the developers of the Times API have distilled the data into aggregates that answer most campaign finance questions. Instead of poring over monthly filings or searching a disclosure database, you can use the Times Campaign Finance API to quickly retrieve totals for a particular candidate, see aggregates by ZIP code or state, or get details on a particular donor.

    Anyone who has tried to play with FEC data, myself included, knows that this API is cool. You could get the data directly from the FEC, but it's a bit of a painstaking process. Now you don't have to sift through a bunch of reports or an awkward user interface.

    The Movie Review API is next in line. After that, who knows, but it's a good step forward for The Times.

    [via serial consign]

  • 3 Applications that Tap Into the Wisdom of Crowds

    September 30, 2008  |  Social Data Analysis

    James Surowiecki writes in The Wisdom of Crowds that the group is smarter than the individual (under four conditions). Essentially, the premise is that if you get enough different people to work on a single problem independently, you're going to get results as good as or better than those of a small group of experts working together. Think of it as advanced crowdsourcing.

    These three applications tap into the wisdom of crowds. It's clearly election season.

  • OneGeology Wants to Be Geological Equivalent of Google Maps

    September 11, 2008  |  Data Sources, Mapping

    There's lots of free geographical data about what's going on at the surface of our planet. It's a different story for what's going on underneath though. OneGeology aims to be the solution to that problem.

    OneGeology is an international initiative of the geological surveys of the world and a flagship project of the 'International Year of Planet Earth'. Its aim is to create dynamic geological map data of the world available via the web. This will create a focus for accessing geological information for everyone.

    I've never been one for geology, but if the data (and interactive maps) were easily accessible, there certainly would be a spike in interest.

    [via msnbc | Thanks, Samantha]

  • If You’re a Criminal on the Run, Don’t Use GPS

    July 11, 2008  |  Statistics

    With all the new technologies we've come to rely on, it's easy to forget just how much data we're automatically logging on our own devices or some central server in the boonies.

    GPS is one such example. Some of us can't imagine going out of town without it. What you might not know is that while that GPS device tells you where to turn left, it is also storing where you go in its memory. Scotland Yard has started using this data to solve crimes:

    Scotland Yard analysis of the [GPS] devices has helped solve dozens of investigations into kidnappings, grooming of children, murder and terrorism. Information about a suspect's whereabouts at particular times, their journeys and addresses of associates can all be discovered - if they have been using a GPS. The devices retain hundreds of records of locations and routes in their memory.

    So all you criminals out there, make sure you use GPS whenever possible. We all know your actions are a desperate cry for attention.

    [Thanks, Tim]

  • FlowingData Cited in Forbes Magazine?

    June 28, 2008  |  Data Sources

    Whaaa? Cool beans.

  • What Do People Want to Do With Their Lives?

    June 17, 2008  |  Data Sources, Projects, Visualization

    43 Things is a goal-setting community where people set goals, cheer each other on, and connect with others who are trying to achieve the same thing. Even if you're not setting goals yourself, it's still interesting and often amusing to see what others have set out to do e.g. go skinny dipping, have a one night stand, and be myself.

  • Our Inability to Understand Statistics of Rare Events

    June 4, 2008  |  Statistics

    Cory Doctorow from The Guardian writes about our inability to understand the statistics of rare events. We obsess so much over the near-impossible probability that something could happen that it clouds our vision of more probable events.

    The rare - and the lurid - loom large in our imagination, and it's to our great detriment when it comes to our safety and security. As a new father, I'm understandably worried about the idea of my child falling victim to some nefarious predator Out There, waiting to break in and take my child away. There's a part of me who understands the panicked parent who rings 999 when he sees some street photographer aiming a lens at a kids' playground.

    But the fact is that attacks by strangers are so rare as to be practically nonexistent. If your child is assaulted, the perpetrator is almost certainly a relative (most likely a parent). If not a relative, then a close family friend. If not a close family friend, then a trusted authority figure.

    According to Doctorow, such misunderstanding is why we gamble in casinos and why we have to wait in long security lines at the airport. We see piles of money and terrorist attacks when ultimately, winning a jackpot or running into violence is far less likely - near impossible - than losing all of your money or losing valuables to a curious luggage handler.

    If there's one thing the government and our educational institutions could do to keep us safer, it's this: teach us how statistics works.

    Amen to that.

    [Thanks, Jan]

Unless otherwise noted, graphics and words by me are licensed under Creative Commons BY-NC. Contact original authors for everything else.