Data hackathon challenges and why questions are important

March 12, 2013  |  Statistics

Jake Porway, executive director of DataKind, on data hackathons and why they require careful planning to actually work:

Any data scientist worth their salary will tell you that you should start with a question, NOT the data. Unfortunately, data hackathons often lack clear problem definitions. Most companies think that if you can just get hackers, pizza, and data together in a room, magic will happen. This is the same as if Habitat for Humanity gathered its volunteers around a pile of wood and said, "Have at it!" By the end of the day you'd be left with a half of a sunroom with 14 outlets in it.

Without subject matter experts available to articulate problems in advance, you get results like those from the Reinvent Green Hackathon. Reinvent Green was a city initiative in NYC aimed at having technologists improve sustainability in New York. Winners of this hackathon included an app to help cyclists "bikepool" together and a farmer's market inventory app. These apps are great on their own, but they don't solve the city's sustainability problems. They solve the participants' problems because as a young affluent hacker, my problem isn't improving the city's recycling programs, it's finding kale on Saturdays.

Without clear direction on what to do with the data or questions worth answering, hackathons can end up being a bust from all angles. From the organizer side, you end up with a hodgepodge of projects that vary widely in quality and purpose. From the participant side, you're left to your own devices and have to approach the data blind, without a clear starting point. From the judging side, you almost always end up having to pick a winner when there isn't a clear one, because the criteria of the contest were fuzzy to begin with.

This also applies to hiring freelancers for visualization work. You should have a clear goal or story to tell with your data. If you expect the hire to analyze your data and produce a graphic, you'd better get someone with a statistics background. Otherwise, you end up with a design-heavy piece with little substance.

Basically, the more specific you can be about what you're looking for, the better.

4 Comments

  • Good post. Definitely felt a bit disoriented and directionless at a hackathon before. Better if the goal is a question they want to have answered with the data than a story they want to tell with it, though. The data may not support the plot they have in mind.

  • Isn’t that the point of hackathons? To create a playground to come together and have fun while experimenting and creating stuff? (And maybe develop something bigger based on that later on.) I believe many organizers actually see it that way, and don’t expect much more. If you, as organizer or data provider, want actual solutions to your specific problems, it would probably be a better idea to hire people instead of giving out some pizzas for free. So, while some clients certainly need to be educated (as Jake is suggesting), many hackathons are reasonable and useful.

    • I think that was the original intention of hackathons, but most of the hackathons I see these days are sponsored by groups who have data and want certain problems solved, but haven’t quite pinpointed what those problems are. So they end up kind of disappointed.

  • Great post. I agree with Till. For many, the whole point of some hackathons is to play around with no set goal other than to get your hands dirty with some data and see what you can build.

    I also think that there is room for hackathons with different intentions, like to come up with creative and clever ways to solve specific problems. This type of hackathon is common with open source software communities that get together to work on specific features or other issues. The Sunlight Foundation has also used this model to invite developers to help write parsers to extract data from financial reports.

    I think the point is though that if the intention is to have an impact on some of the issues outside of the computer lab, like helping those in need, or helping cities solve some of their challenges, then it’s essential to talk to the folks that know something about the problem being solved.

Unless otherwise noted, graphics and words by me are licensed under Creative Commons BY-NC. Contact original authors for everything else.