How Open Should Open Source Data Visualization Be?
I used to ride my bike to school, and I always forgot my U-lock. Instead of riding back for it, I'd just stash my bike unlocked in between a cluster of bikes. I told my friend jokingly, "It'll be OK. 98% of people are good." One day I got out of class, and my bike was stolen.
I was cleaning up some ActionScript in preparation for a tutorial post on how to make your own animated Walmart map, but a couple of bad memories involving stolen code and bad knockoffs (of my work) stopped me midway. I had to think:
Is releasing my code the best thing to do?
I'm sure the consensus is a resounding yes, but what's to stop some lazy person from ripping off my code and pawning it off (or worse, selling it) as their own? What if I want to sell my visualizations? I am after all a lowly graduate student. It'd be nice to have another income stream.
On the other hand, had others before me not released their work under that wonderful BSD license, I would not be able to do what I do. At least not as easily. Modest Maps? Free. TweenFilterLite? Free. Flare Visualization Toolkit? Free. If I don't follow suit, does that make me selfish? Yes, it does.
Giving Back to the Community
I've heard that phrase, giving back, so many times in both the real-life sense and the digital one, but it never made much sense to me. I mean, I got it, but I never really got it.
Perhaps I never understood it, because I wasn't using much of the community's resources nor did I have anything to give back. I have something to give back now. I can help people learn in the same way that others before me have and still do. I'm incredibly thankful to those who maintain these open source projects and still help me out from time to time when there's really nothing in it for them.
The least I can do is continue to promote this idea of openness and help this small field of data visualization flourish into what it deserves to be. It's why I blog, and it's why I should give back, but to what extent?
Making the Case for Open Source Data Visualization
My dilemma brought me back to a Data Evolution post on open source data visualization. It highlighted three things:
- Open Tools – As in freely available software tools like R and Processing.
- Open Code – How often have you seen a visualization and wondered, "How did they do that? If only the code were available."
- Open Data – Oh so important in data visualization. The core. Open data means more people can try out different methods.
It's not always possible to attain all three. For example, we pay money for software because the companies would not exist otherwise. It's a business, and to think that software companies would develop a bunch of free software is unrealistic. Also, oftentimes, data just can't be shared – usually because of privacy issues. Lastly, open code doesn't make sense a lot of the time. The DE post grades The New York Times with a D for openness, but they're a news business, not a visualization repository.
While we can't always attain all three things, there's no reason why we can't strive toward that ideal. As someone I know likes to say: strive for perfection. You might not reach that standard, but you could end up with something close.
Open source is a development method for software that harnesses the power of distributed peer review and transparency of process. The promise of open source is better quality, higher reliability, more flexibility, lower cost, and an end to predatory vendor lock-in.
@rpj: To release, always! (When legally possible.)
@ehrenc: re: code. You could always release half the code :)
@pims to release code. There's some brilliant people around that can build on top of what you did. Open world :)
As for me, well, let's just say you should expect to see tutorials – complete with code – in the coming weeks.