Gimme my data

We intentionally and unintentionally put data in places like Facebook and Google, but most of us don’t think much of it. In an interview with The Guardian, Tim Berners-Lee, inventor of the Web, explains why you should care.

“My computer has a great understanding of my state of fitness, of the things I’m eating, of the places I’m at. My phone understands from being in my pocket how much exercise I’ve been getting and how many stairs I’ve been walking up and so on.”

Exploiting such data could provide hugely useful services to individuals, he said, but only if their computers had access to personal data held about them by web companies. “One of the issues of social networking silos is that they have the data and I don’t … There are no programmes that I can run on my computer which allow me to use all the data in each of the social networking systems that I use plus all the data in my calendar plus in my running map site, plus the data in my little fitness gadget and so on to really provide an excellent support to me.”

Of course, getting users to see that is easier said than done. And until users see it, companies have little incentive to provide such a service. In turn, it’s hard for data people to make a case to users, and you end up with a lot of hand waving. Challenge accepted?


  • Do you think things like openpaths begin to fill this space?

  • Good luck getting the average user to understand the need for breaking data out of its various silos. I have yet to work for a company that recognizes that need, and their profit margin usually depends on gathering and understanding that fragmented data. I know more businesses are catching on, but we’re way behind that curve outside the tech industry.

  • I don’t think it’s so much of a user issue as it is an opportunity for start-ups. Using APIs, you can already get all that information (or most of it anyway… not sure about the “little fitness app” he mentions). But the point is, a company could certainly build an app that ties in all of those things and makes it useful. It’s just a matter of doing it. I think most startups think in terms of a “one hit,” meaning keep it small and elegant. It’s tough to sell a larger project like this, one that relies on so many data feeds that may shift and change and require a lot of maintenance, not to mention the AI or machine learning required for making sense of your FB posts and how they relate to your fitness or financial goals. It’s a lot of work.

  • In fact, it is not that inaccessible for data people (whatever that means). There is a bit of tooling to do, though. We interact a *lot* through the browser. This browser is a data broker. Now imagine that you create a proxy service which records all your HTTP transactions (POST and GET). You can start to create a series of logging add-ons that put the important, interesting bits into a DB. For sure it will break each time their APIs, their markup, etc. change. But it is already a possibility.

    For the APIs part that Kim is describing, the important thing is to keep a copy of all this data locally. So the future might break, but the past is kept.

  • It would be interesting to see someone put together some recipes on that could better integrate data across accounts since they’ve already done quite a bit of work with each application’s API.
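
The proxy-and-log idea raised in the comments above can be sketched roughly as follows. This is only a minimal illustration, not anyone's actual tool: the table layout and the names `open_log`, `record_transaction`, and `transactions_for_host` are hypothetical, and the proxy itself is assumed to exist and to call `record_transaction` once per request it sees. Only the local, append-only storage side is shown, since that is the part that keeps the past even when a site's API or markup changes.

```python
import sqlite3
from datetime import datetime, timezone
from urllib.parse import urlsplit

# Hypothetical schema: one row per recorded HTTP transaction.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transactions (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    recorded_at   TEXT NOT NULL,
    method        TEXT NOT NULL,
    host          TEXT NOT NULL,
    url           TEXT NOT NULL,
    request_body  TEXT,
    response_body TEXT
)
"""


def open_log(path=":memory:"):
    """Open (or create) the local log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_transaction(conn, method, url, request_body=None, response_body=None):
    """Append one GET or POST transaction to the log.

    Returns True if the transaction was stored, False if it was skipped
    because the method is something other than GET or POST.
    """
    method = method.upper()
    if method not in ("GET", "POST"):
        return False
    host = urlsplit(url).hostname or ""
    conn.execute(
        "INSERT INTO transactions "
        "(recorded_at, method, host, url, request_body, response_body) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            method,
            host,
            url,
            request_body,
            response_body,
        ),
    )
    conn.commit()
    return True


def transactions_for_host(conn, host):
    """Return (method, url, response_body) rows for one host, oldest first."""
    rows = conn.execute(
        "SELECT method, url, response_body FROM transactions "
        "WHERE host = ? ORDER BY id",
        (host,),
    )
    return rows.fetchall()
```

Passing a file path to `open_log` instead of the default `":memory:"` would keep the log on disk between runs. Pulling the "interesting bits" out of each stored body is left out on purpose: that is the site-specific part that would need updating whenever a service changes its API or markup.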