  • Map your location – that your iPhone secretly records

    April 20, 2011  |  Data Sources, Mapping

    iphone gps trace

    Researchers Alasdair Allan and Pete Warden have found that the iPhone records cell tower access, and hence your location, in an easy-to-read file that is transferred as you switch devices. And it does this whether you like it or not.

    The more fundamental problem is that Apple are collecting this information at all. Cell-phone providers collect similar data almost inevitably as part of their operations, but it’s kept behind their firewall. It normally requires a court order to gain access to it, whereas this is available to anyone who can get their hands on your phone or computer.

    Allan and Warden provide an open-source application, iPhone Tracker, that maps that data. The good news is that the data doesn't seem to go anywhere other than your own backups and devices. Privacy concerns aside, this kind of makes me wish I had an iPhone, although I suspect my map would be painfully boring.
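    For a rough idea of what reading such a file involves, here is a minimal sketch. It is not the iPhone Tracker code. It assumes, based on reports at the time, that the file is a SQLite database with a CellLocation table holding Timestamp, Latitude, and Longitude columns, and that timestamps count seconds from January 1, 2001 UTC. The helper make_demo_db is hypothetical and only builds a tiny stand-in database so the sketch runs without a real backup.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are seconds since 2001-01-01 UTC (Apple's reference date).
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def read_cell_locations(conn):
    """Return (utc_datetime, latitude, longitude) tuples, oldest first.

    Assumes a table named CellLocation with Timestamp, Latitude and
    Longitude columns; the real schema may differ.
    """
    rows = conn.execute(
        "SELECT Timestamp, Latitude, Longitude "
        "FROM CellLocation ORDER BY Timestamp"
    )
    return [
        (APPLE_EPOCH + timedelta(seconds=ts), lat, lon)
        for ts, lat, lon in rows
    ]


def make_demo_db():
    """Hypothetical helper: build an in-memory stand-in for the location file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CellLocation (Timestamp REAL, Latitude REAL, Longitude REAL)"
    )
    conn.executemany(
        "INSERT INTO CellLocation VALUES (?, ?, ?)",
        [(86400.0, 37.77, -122.42), (0.0, 40.71, -74.01)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for when, lat, lon in read_cell_locations(make_demo_db()):
        print(when.isoformat(), lat, lon)
```

    With a real backup you would pass sqlite3.connect(path_to_file) instead of the demo connection, and the resulting points could be handed to any mapping tool.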

    [iPhone Tracker via Marco]

  • Data.gov and other transparency sites to be shut down due to budget cuts

    March 31, 2011  |  Data Sources

    Last week, there were rumblings over the end of the Statistical Abstract, and I suggested that it was just a sign of changing technologies. I thought that Data.gov and similar sites were the natural progression. Here's the problem with that argument. Congress is planning on shutting down Data.gov and other transparency sites in the next few months.

  • Tell-all telephone reveals politician’s life

    March 30, 2011  |  Data Sources, Mapping

    Tell-all telephone

    Not many people understand the importance of data privacy. They don't get how little bits of information sent from your phone every now and then can show a lot about your day-to-day life.

    As the German government tries to come to a consensus about its data retention rules, Green party politician Malte Spitz retrieved six months of phone data from Deutsche Telekom (by suing them) to show what you can get from a little bit of private mobile data. He handed the data to Zeit Online, and they in turn mapped and animated practically every one of Spitz's moves over half a year and combined it with publicly available information from sources such as his appointment website, blog, and Twitter feed for more context.

  • Lots of health data released via Health Indicators Warehouse

    March 1, 2011  |  Data Sources

    Health indicators warehouse

    The government has been making a big push for more open health-related data, and a couple of weeks ago, they released a whole bunch of it with the launch of HealthData.gov. It's the same interface as Data.gov, but for health. Additionally, the Health Indicators Warehouse launched with different data and a slightly more useable interface.

    A quick scan of the data available, however, does seem to indicate that a lot of it is spotty or outdated (like on data.gov), which doesn't make it especially useful. For example, some data sets are only one data point, while others are only a single year. At least it's a start.

    [Health Indicators Warehouse via @periscopic]

  • Million song dataset available for download

    February 24, 2011  |  Data Sources

    Need music data? Get all the data you want and more from the freely available million song dataset, offered by LabROSA at Columbia University and Echo Nest. There's lots of metadata on song features and your standard stuff like year and artist. There are also several code wrappers and samples to help researchers make use of the data right away.

    [Million Song Dataset via @MacDivaONA]

  • Sunlight Labs opens up Real Time Congress API

    February 17, 2011  |  Data Sources

    Sunlight Labs continues its work for a more open government with its recent release of the Real Time Congress API.

    Today we're making available the Real Time Congress API, a service we've been working on for several months, and will be continuing to expand.

    The Real Time Congress API (RTC) is a RESTful API over the artifacts of Congress, kept up to date in as close to real time as possible. It consists of several live feeds of data, available in JSON or XML. These feeds are filterable and sortable and sliceable in all sorts of different ways, and you can read the docs to see how.

    There are seven data types the API will report:

    • Bills
    • Votes
    • Amendments
    • Videos
    • Floor Updates
    • Committee Hearings
    • Documents

    Now someone has to do something with all of this data coming in. Can you think of a useful application for what is essentially an automated government Twitter feed?

    [Real Time Congress API]

  • Find more of the data you need with DataMarket

    January 31, 2011  |  Data Sources, Online Applications

    Add another online destination to find the data that you need. DataMarket launched back in May with Icelandic data, but just a few days ago relaunched with data of the international variety. They tout 100 million time series datasets and 600 million facts. I'm not totally sure what that means (100 million lines, sets of lines?), but I take it that means a lot.

    Just over 2 years and countless cups of coffee after we started coding, DataMarket.com launches with international data. You can now find, visualize and download data from many of the world’s most important data providers on our site.

    At first glance, DataMarket feels a lot like the now-defunct Swivel. Search for the data you want and you get back a list of datasets. The focus on only time series, though, is actually a plus in that they can provide more specific tools to visualize and explore. The current toolset isn't going to blow you away, but it's not bad.

  • A guide for scraping data

    January 17, 2011  |  Data Sources

    Data is rarely in the format you want. Dan Nguyen, for ProPublica, provides a thorough guide on how to scrape data from Flash, HTML, and PDF. [via @JanWillemTulp]

  • Jon Stewart explains Wikileaks’ Cablegate

    December 2, 2010  |  Data Sources, News

    You've probably already heard and read about Wikileaks' Cablegate. If not, Andy Baio has a fine roundup with significant coverage and events to get you caught up quick. Alternatively, you can watch Jon Stewart and The Daily Show explain in the clip below (slightly NSFW, because it mentions a body part).

  • How do people use Firefox?

    November 30, 2010  |  Data Sources, News

    Mozilla Labs just released a bunch of anonymized browsing data for their open data visualization competition:

    This competition is based on Mozilla's own open data program, Test Pilot. Test Pilot is a user research platform that collects structured user data through Firefox. All data is gathered through pre-defined Test Pilot studies, which aim to explore how people use their web browser and the Internet.

    There are two datasets in various formats. The first is browsing behavior from 27,000 users, including on/off private browsing that we saw a few months ago. The second dataset is from 160,000 users and is on how they actually use the Firefox interface.

    Additionally, both sets have survey answers to questions like "How long have you used Firefox?" which could make for some fun and interesting breakdowns.

    The deadline is December 17.

    [Mozilla Labs]

  • Opportunities in Government 2.0

    October 27, 2010  |  Data Sources

    Vivek Wadhwa talks government data and the (financial) opportunities ripe for the picking:

    What is happening with the opening up of government data is nothing less than a silent revolution. There are literally thousands of new opportunities to improve government and to improve society—and to make a fortune while doing it. Unlike the Web 2.0 space, which is overcrowded, Gov 2.0 is uncharted territory: a new frontier to explore, grow things on, and settle on. It’s fresh soil for unlikely seedling ideas that, if they take root, could lead to very successful ventures. So I encourage entrepreneurs to stake their claims as soon as they can.

    Wait a minute. Hold up. You can do more with government data than awkward dashboards? Bring it.

    [TechCrunch via @ucdatalab]

  • How people use private browsing

    August 25, 2010  |  Data Sources, Statistics

    Time of day people use private browsing

    Private browsing. All the modern browsers have it. Turn it on, and the browser won't keep your history during the session. Sometimes it's used to pay bank bills on a public computer. Sometimes it's used for other stuff. In an opt-in study looking at a week in the life of a browser, Mozilla looked at how people use private browsing.

    Again, it's worth noting that people opted in to this study (about 4,000 of them), and Mozilla only recorded when users started and stopped private browsing. Nothing in between.

    That said, they came up with two basic findings. The first is when people typically use private browsing (above).

    They saw usage spikes during the lunch hours as well as just before the work day ended. The other spike is after the dinner hours and then finally, in the late hours of the night.

  • How weather data became open data

    August 18, 2010  |  Data Sources

    Weather in the private sector is a more than $1.5 billion industry, and it's largely because of the government's open weather data. You can find what the weather is just about anywhere with just a few clicks of the mouse. It wasn't always like that though. Clay Johnson, former director of Sunlight Labs, describes the history of open weather data, starting with Thomas Jefferson in the late 1700s.

  • Afghanistan war logs revealed and mapped

    July 27, 2010  |  Data Sources, Mapping

    Afghanistan incidents from war logs

    This past Sunday, well-known whistle-blower site Wikileaks released over 91,000 secret US military reports, covering the war in Afghanistan. Each report contains the time, geographic location, and details of an event the US military thought was important enough to put on paper.

  • Data and its impact on journalism

    June 7, 2010  |  Data Sources, Statistics

    With regard to the UK's recent boom in open data, Simon Rogers of the Guardian ponders data's role in journalism and the opportunities this newfound information could bring:

    The impact on journalism is expected to be great. The Chicago-based web developer and founder of the neighbourhood news site EveryBlock, Adrian Holovaty, says it's going to be challenging but exciting for journalists. "As more governments open their data, journalists lose privileged status as gatekeepers of information – but the need for their work as curators and explainers increases. The more data that's available in the world, the more essential it is for somebody to make sense of it."

    This need not only creates a fresh brand of news, but also a new type of journalist:

    I once prided myself on my lack of maths knowledge. Now I find myself editing a datajournalism site, the Guardian's datablog: a site where we use Google Spreadsheets to post key datasets. We make the data properly accessible, then encourage our users to take the numbers, produce graphics and applications and help us look for stories.

    Priding yourself on not knowing how to deal with data is a little weird, but okay.

    In any case, people always ask me how to get into information design, infographics, visualization, etc. Journalism is one of those paths, and there's a lot of opportunity there if you've got the skills.

  • Egregious Citations Issued to BP

    June 6, 2010  |  Data Sources

    BP processes about 1.5 million barrels of crude oil per day, across six refineries in the United States. In total, 150 refineries in the United States process just under 18 million barrels per day, so BP processes about 8.5 percent of it. However, as reported by the Center for Public Integrity, 97 percent of the most dangerous violations found by OSHA were on BP properties.

  • Live webcast: Community Health Data Initiative

    June 2, 2010  |  Data Sources, News

    Health and Human Services (HHS) is about to announce the launch of their Community Health Data Initiative over in DC right now. The point is to make health data more usable for consumers and communities.

    Today groups will be presenting how they've made use of the data in the past few weeks from about 9:30 to 10:30 - as in right now. I've embedded the live webcast below.

    They're just going through the formalities of thank yous and intros right now, but the good stuff should start soon.

  • Twitter data buffet is back in business

    April 28, 2010  |  Data Sources

    Almost a year and a half ago, Infochimps, the data repository slash marketplace, released a giant scrape of Twitter data representing 2.7 million users, 10 million tweets, and 58 million connections. Twitter soon requested that they take it down while they figured out how they wanted to handle licensing, privacy, etc.

    That was in 2008, before Twitter really started booming. Fast forward to now. Twitter and Infochimps have figured out what they want to do, and the Twitter census data is back up. It's no longer a measly 2.7 million users though. The population has grown to 35 million.

  • World data released ‘is a dream come true’

    April 20, 2010  |  Data Sources

    In another step towards open data and all that jazz, the World Bank released World Development Indicators 2010 today, which is meant to serve as a progress report of the world.

    “The WDI provides a valuable statistical picture of the world and how far we've come in advancing development,” said Justin Yifu Lin, the World Bank’s Chief Economist and the Senior Vice President for Development Economics. “Making this comprehensive data free for all is a dream come true.”

    More importantly though, this comes with the launch of the freely available online database and public API to 1,000+ indicators. There used to be a big fee for this data. I can't speak for the API, but the website is well-designed. It has profile pages for each country, links to download the indicators in Excel and XML, and hey, are those graphs implemented in HTML5? I spy <canvas> tags.
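    As a rough sketch of what using that public API might look like, the snippet below builds a request URL and parses a response. It assumes the current v2 endpoint layout (country code, then indicator code, with format=json) and the two-element JSON response of page metadata followed by records, which may differ from what was available at launch. The function names indicator_url and parse_series are hypothetical, and no network call is made here.

```python
import json
from urllib.parse import urlencode

# Assumption: current World Bank API base URL; the 2010 launch version may have differed.
BASE = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, start, end):
    """Build the URL for one indicator, one country, and a range of years."""
    query = urlencode(
        {"format": "json", "date": f"{start}:{end}", "per_page": 100}
    )
    return f"{BASE}/country/{country}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn a JSON response into {year: value}, skipping missing values.

    Assumes the response is a two-element array: page metadata first,
    then a list of records carrying 'date' and 'value' fields.
    """
    _meta, records = json.loads(payload)
    return {
        int(rec["date"]): rec["value"]
        for rec in records
        if rec["value"] is not None
    }


if __name__ == "__main__":
    # SP.POP.TOTL is the indicator code for total population.
    print(indicator_url("BR", "SP.POP.TOTL", 2000, 2010))
```

    Fetching the URL with any HTTP client and passing the response body to parse_series would give a year-to-value mapping ready for charting.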

Copyright © 2007-2014 FlowingData. All rights reserved. Hosted by Linode.