Category: Data Sharing

  • Tim Berners-Lee with an update on open data

    Posted Mar 15, 2010 to Data Sharing / 1 comment

    If people put data on the Web - government data, scientific data, community data - whatever it is, it will be used by other people to do wonderful things in ways they never could have imagined.
    Tim Berners-Lee, TED, February 2010

    Tim Berners-Lee, credited with inventing the World Wide Web, comes back to TED a year after his call for open, structured data with a quick update. Spoiler alert: things are looking good - and they're only going to get a lot better. But you already knew that, right?

    [via infosthetics]

  • Open Thread: Is Google Latitude Dangerous?

    Google recently released Google Latitude, which is an online application that lets you share your location with online friends.

    Of course when any application shares where you are at any given time, people start to feel like Big Brother is looming in the background ready to sneak up on us from behind a giant bush. Some call it a real danger, but is it really? I put this question out to all of you:

    Is Google Latitude a danger to anyone who uses it?

    My take on things is that people are already doing it anyways, so why not make it easier for those who are interested? Sure, if some stalker got a hold of your location, that could be bad, but that's true for a lot of data... credit card statements, cell phone logs, Twitter... As long as the proper security measures are put in place, I don't see what all the fuss is about.

  • Walker Tracker – A Community Site for Pedometer Fans

    Posted Jan 23, 2008 to Data Sharing / 1 comment

    Those of you who have been around since the beginning know that I am just obsessed with my pedometer, although lately I haven't felt inclined to go for a winter stroll in the below-freezing weather. When I was keeping track of my steps though, one of the difficulties was staying consistent. Sometimes I would forget to wear my pedometer, while other times I would forget to record my steps.

    I imagine Walker Tracker could help a bit in solving that second problem. I know it was always easier to make it to the gym when I knew one of my friends was going to meet me there. Walker Tracker is like that friend at the gym. The site lets you keep track of your steps as well as see how others are doing.

    We're trying to change the world. We're trying to get you and us and everyone we know off the elevator and out of the car and onto the sidewalks and trails. We're doing it one step at a time.
    Get up, stand up and walk.

    OK, maybe it's a little hoorah, but if you feel like actually accomplishing a New Year's resolution this year, Walker Tracker could be a good place to start.

    [via Web Worker Daily]

  • 100 Reasons You Should Be Interested in, Want to Share, and Get Excited About Data

    Posted Nov 7, 2007 to Data Sharing / 2 comments

    When I talk about data, people often zone out or don't really see the interest. Why does this happen? People just don't understand the wonder that is data and how much of their life is led by data. With that in mind, why would people share their data? You can't share something you don't know exists. Off the top of my head, here are 100 reasons to be interested in, want to share, and get excited about data.

    1. Be completely transparent to build trust
    2. It won't seem like you're hiding something
    3. Understand impact on the environment
    4. Get opinions from other people/experts
    5. Increase awareness of neighborhood
    6. Truth in numbers
    7. Provide better examples to argue a point
    8. Wisdom of crowds
    9. Pretty pictures
    10. Beautiful dynamic data visualization
    11. Proof in the data
    12. Understanding of the world
    13. Understanding of yourself
    14. Understanding of your neighborhood
    15. Understanding of your city
    16. Understanding of your county
    17. Understanding of your state
    18. Earn the one million dollar Netflix prize
    19. Appreciate sports on a different level
    20. Data is cool
    21. Save money on utilities
    22. Data-driven art
    23. Overcoming unwarranted biases
    24. Avoid jumping to conclusions
    25. Understand confusing politics
    26. Make educated election votes
    27. Enjoy a new way of programming
    28. Find and see trends over time
    29. Find and see themes over geographical regions
    30. Watch changes over space and time
    31. Explore relationships between network nodes
    32. Know what crime-ridden areas to stay away from
    33. Check up on proper news reporting
    34. Watchdog on scientific research results
    35. Work for a cool newspaper like the New York Times
    36. Improve network protocols
    37. Optimize traffic flows
    38. Minimize the amount you spend on flights
    39. Find ideal products based on what you've already purchased
    40. User-specific book recommendations
    41. User-specific movie recommendations
    42. Image and vision sciences
    43. Statistical computing
    44. Produce real research results
    45. Find drugs that actually help and don't harm
    46. Deciphering genetic code
    47. Cryptography
    48. Understand dorky math dramas
    49. Spam protection
    50. Improve business
    51. Earn big money in blackjack
    52. Increase sales
    53. Earn more money from advertising
    54. Gain an appreciation of numbers
    55. Online and public databases
    56. Lose weight
    57. Gain weight
    58. Improve workout routine
    59. Industrial engineering
    60. Understand government policy
    61. Get an 'A' in statistics
    62. Data visualization is gaining momentum
    63. Amount of data is growing constantly
    64. Make a more tasty wine
    65. Find out the public opinion
    66. Save money while surveying the public
    67. Win in Yahtzee
    68. Game theory
    69. Law of Large Numbers
    70. Central Limit Theorem
    71. Weather forecasting
    72. Financial forecasting
    73. Know what stocks to invest in
    74. Figure out where to put your extra cash
    75. Market research
    76. Learn to release profitable movies
    77. Accountability
    78. Accounting
    79. See from a different angle
    80. Identify the best in a large group
    81. Identify the worst in a large group
    82. Discover who is cheating on tests
    83. Research why crime is on the rise
    84. Make your arguments more credible
    85. Identify who is making up results
    86. Sharing is caring
    87. Data from many often provides more than data from one
    88. Natural language processing
    89. Face identification in a crowd
    90. Save the whales
    91. Improve computer performance
    92. Find cures for diseases
    93. Appreciate cool Stamen Design projects
    94. Optimize crop growth
    95. Compare and contrast profiles
    96. Enroll in a great statistics graduate program like UCLA's
    97. Detect major changes in climate
    98. Detect small changes in micro-climates
    99. Data is fun

    What did I miss?

  • Access Restrictions on the Release of Gun Sales Data

    Posted Oct 24, 2007 to Data Sharing / 3 comments

    I just found this in my draft folder from a while back. It's kind of old news, but I think it's still worth mentioning.

    Gun control advocates failed to win access to gun sales data for local governments and law enforcement agencies.

    The House Appropriations Committee defeated two attempts by gun control advocates to strip four-year-old restrictions on the use of information from Bureau of Alcohol, Tobacco, Firearms and Explosives tracing gun sales. The votes were a victory for the National Rifle Association and came despite the Democratic takeover of Congress in January.

    One side argues that gun sales data will help law enforcement agencies track gun dealers who sell guns illegally. The other side argues that there's privacy at stake, and there's a chance that police officers' identities could be inferred. A big victory for gun rights advocates, or so the article might suggest.

    My opinion -- even if gun sales data were given to law enforcement, how could anyone guarantee data integrity? I think it's fair to say that dealers selling guns illegally aren't going to provide accurate reports. Sell a gun under the table with cash, don't report it, and the data doesn't reveal much. Am I missing something here?

  • Second Day of New York Taxi Strikes

    Posted Sep 6, 2007 to Data Sharing / Add your comment

    Cabspotting

    As the second day of the New York taxi strike begins over GPS and credit card technology, I'm left wondering what taxi drivers are making such a big fuss over. First, drivers are complaining that GPS is an invasion of privacy, and second, they argue that credit card transactions will cause a decrease in profits due to credit card fees.

    Starting with the credit card transactions, I'm about 80% sure that drivers don't have any actual data to back up their claims that they're going to start making less money. Non-strikers say that the credit card capability will not only help business (by bringing in those with corporate credit cards), but also increase tips. This information comes from cabs that are already equipped with the proper gizmos.

    What are taxi drivers trying to hide? What is this invasion of privacy talk? These drivers are working for a large company. I repeat, they're working. I don't demand a private office when I'm at work, and I don't see much reason drivers should care a whole lot. If someone is slacking, taking shady routes, or just plain doing something they're not supposed to do, then they should be held accountable. Unless I'm mistaken, I don't recall a whole lot of whining when San Francisco cabs had similar equipment installed.

    So stop the fuss, and just modernize up to the proper century, New York cab drivers. I'm sure Stamen Design and Cabspotting* would greatly appreciate it.

    *I am not associated with either.

  • Don’t want to share our data / OK, what’re you hiding?

    Posted Aug 20, 2007 to Data Sharing / Add your comment

    I don't want my credit card numbers floating around, because then I'd be screwed. That kind of data needs to be locked up tight behind a billion firewalls, a locked safe, five armed guards, and another locked safe and then one more guard plus another safe. However, there are lots of other kinds of data that should be online and publicly available or at least accessible via a phone call.

    As a student, I've always received data from the prof or from some magical place called data land. It's not that easy in the real world, and as an intern, I'm beginning to see a trend -- if you're not willing to give me your data or some tiny subset of your data, then you're probably hiding something.

    I recently did a whole lot of back and forth for two weeks trying to get some data from a group that will go unnamed. Without getting into too many details, I wanted data that showed the group's progress -- what they've accomplished over X number of months. You should probably also know that this group has taken a lot of heat lately for their slow pace and shoddy labor.

    Here's how it went.

    Day 1-3

    "Nathan, can you contact so and so and ask them for this and this data or see what they have?" Sure, no problem. I emailed the reporter's contact, who happened to be a contractor for the big group I was trying to get data from. We exchanged some emails, and it turned out that the contractor was working with the data that was exactly what we wanted. Um, gimme.

    Day 4-8

    The contractor had to get approval from the "chief of staff." Unfortunately the chief of staff was out for the week, so he had to go through some other people. Contractor gets distracted, and I get forwarded to public affairs. "Oh great, this will be fun," I thought. Of course, this is when it got especially painful. Some misunderstandings and 11 emails later, it was back to the contractor. Same old story. Need approval, yada yada. Keep in mind that during all of this, my co-worker is putting together a graphic.

    Day 9-13

    It was just all waiting now. They had the data and were waiting to get the sign off. I called one or two times a day and sent an email to both the contractor and public affairs guy once a day. There was lots of fluffy, meaningless talk during this phase.

    Day 14

    At the end of Day 14, I got the phone call. "Nathan, we have some data that we're ready to send your way. Your patience has been rewarded." I can't believe he actually said that. My patience had been rewarded with nothing. Too bad the graphic was already entering its final editing stages without their data.

    The data wasn't really worth the effort.

    Hence, the Difficulty

    So here we stand with this great idea of sharing data. So wonderful and marvelous, we can't even fathom how we can benefit. However, data can be very revealing, and there are many groups, people, and organizations who aren't ready to show what they have. Either they're afraid of sharing data for security reasons (which is understandable), or they're afraid of what the data will reveal about them. In both cases, it's a huge blockade that I don't see us getting through any time soon.

  • My Mission is to Collect Basic Data

    Posted Aug 13, 2007 to Data Sharing / 3 comments

    Pedometer

    I began my path of higher education at Berkeley as an Electrical Engineering and Computer Science student. As a stat graduate student, it's hard to remember sitting in all of those (boring) engineering classes.

    If I learned anything though, it was from the painful computer science projects. No matter how big the project, I would start by breaking it up into lots of mini-tasks and work my way up to the final solution. I think this has helped me a lot not only in grad school, but solving problems in my life. Hence, my first attempt at continuous data collection has started at a very basic level -- my pedometer.

  • Immigration Data Available from Homeland Security

    Posted Jul 5, 2007 to Data Sharing / Add your comment

    There was a Sharp Rise Seen in Applications for Citizenship, as reported in The Times today, and of course there was a graphic to complement that article that showed the rise in applications over the years as well as a by-country breakdown for 2006.

    Surge Seen in Applications for Citizenship

    Graphics in The Times always cite the source, which was the Department of Homeland Security in this case. I thought, "Do they have some kind of source who they actually call to get this data?" Thinking such a thing, I feel pretty dumb now. In fact, I always see that source on all of the graphics, and have just assumed that there was some connection between The Times and the source.

    Wrong.

    So lazy me finally decided to look into things, and you know what, the Department of Homeland Security has a whole section on their website for Immigration Statistics. There are freely available spreadsheets, reports, publications, and even a little something on data standards and definitions, prepared by none other than the -- Office of Immigration Statistics. Very pleased.

    It's kind of sad that this is just now news to me, but better now than never, eh?

  • CitiStat: Injured on Duty “Data”

    Posted Jul 2, 2007 to Data Sharing / Add your comment

    CitiStat Buffalo

    I was flipping through the channels the other night and came across a televised CitiStat meeting for June 1. A bit of a coincidence since I happened to be looking at the CitiStat website earlier that day. What's CitiStat, you ask? Well, it's like a spin-off of CompStat, a program in NYC and LA that makes police officials accountable for their actions by looking at data -- number of homicides, where they happened, what's being done, etc. CitiStat, in Buffalo, is the same thing, but for the Police, Fire Department, and whatever else they can think of, and seemingly not quite as reputable.

    Anyways, they were talking to some city official about fire department employees that were IOD, um, that's injured on duty (but I must've heard IOD like a billion times). There was some discrepancy on the definition of IOD. As a result, the data was worthless. The police commissioner spoke as well with his own IOD numbers. After that, there was a lot of arguing and as a result, a meeting was agreed upon. Well, not really. They agreed that they would schedule some meeting, but it's been a year of "What is an IOD?" Pretty sure that won't be settled for a while.

    They were also able to agree that the number of IODs was somewhere between 50 and 200. Yay.

    So despite the fact that the CitiStat program is two years old, there's still lots to be done. Officials aren't used to recording and looking at data, and it's clear that few even had any notion that data could be useful. However, I am glad that they're making the effort -- even if all of the data is stored on a bunch of inconsistent Excel spreadsheets :P.