Why I Do Not Swivel Data
I've been back and forth on whether or not I wanted to post about this. Two reasons: it feels blasphemous to feel this way, and I'm not sure if I'm working for or against my hopes for data awareness. I also think I might be getting some mild form of carpal tunnel. Ow.
I'm a graduate student in Statistics, and I don't like Swivel. Why? How is that even possible? All of my work encircles data, I blog about flowing data, and I read about data. So why can't I force myself to enjoy the "tasty data treats for data geeks" offered by Swivel?
People (Not) Part of the Team
There are smart people behind Swivel. That's clear. However, there seem to be some missing pieces.
The company's founders have backgrounds in physics (and apparently computer science), and there are three main advisers -- two of whom, I'm guessing, are very computer science oriented while the third has business in mind.
Let's think about this for a second.
Swivel is a data store that houses lots of different types of data from various sources. Data is then visualized with some graph/plot/map thing and correlated with other data sets that already exist in the database.
I'm going to pause it right there.
There should really be a data/information visualization expert advising Swivel, because they are clearly lacking one from the looks of their graphs (which I will get to soon). Secondly, they're trying to do some serious correlations between multiple-type data sets. For this part in the data chain, Swivel needs to up their statistical expertise. Yes, we all took or will take that introduction to statistics course in undergrad; however, that's not enough. Reliability gets much more complicated for real-world data.
If there's already a stat and/or visualization expert on the job then please forgive my ignorance.
Visualization Needs Improvement
I think the type of visualization Swivel offers is an indication of who Swivel caters to and the type of data they plan to focus on. What do those visualizations remind you of? They remind me of PowerPoint, and when I think of PowerPoint, I think of boring talks and silly graphs. Why must people put background images on their graphs? The only purpose those pictures serve is distraction from what should be the focus -- the data.
There's also the interaction problem. Outside of hovering over a bar or point to see the value, there isn't much else. Well, I take that back. You can click on spots that will take you to a data table, but in turn, you're taken away from the viz, and you're right back where you started.
I'm Not Sure About the Statistical Validity
There are "correlations" for each graph, but what do they mean? I'm looking at the Hurricane Katrina graph, and I see that White and Black are 97% correlated. I'm sure they're plugging the data into some formula, but what assumptions are being made about the data? How do people use these numbers if they don't understand correlation?
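To show why the assumptions matter, here's a minimal sketch. It assumes the number is a plain Pearson correlation, which is my guess and not something I know about Swivel's formula. The data below is made up purely for illustration, and the pearson function is my own helper, not anything from Swivel.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up numbers, purely for illustration (not Swivel data).
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
r_without_outlier = pearson(x, y)

# The same data plus a single extreme point.
r_with_outlier = pearson(x + [100], y + [100])

# y depends completely on x, but the relationship is not a straight line.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [v * v for v in curve_x]
r_curve = pearson(curve_x, curve_y)

print(r_without_outlier, r_with_outlier, r_curve)
```

The four original points give a correlation of about 0.6. Add one extreme point and it jumps above 0.99. The curve gives a correlation of zero even though y is determined entirely by x. One number, very different stories, and you can't tell which story you're in without looking at the data and knowing what the formula assumes.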
Swivel also has a comparison feature to place one data set against another. Finally we can compare our sales data to the weather, Swivel tells me. I don't know about this one. Again, I'm not sure what they're doing or what assumptions are being made about the data. Honestly, I haven't been able to make any successful comparisons using the weird user interface.
Maybe I'm just dumb.
Needs Better Data Quality Control
I previously pointed to a Swivel graph that showed use of steroids and performance-enhancing drugs. The only problem was that the data was for testing -- not use. That's a big difference, yes?
This wasn't just some random graph that I dug up in the depths of some deep, dark abyss, hidden away for no one to see. It was a featured graph on the Swivel homepage. Shouldn't someone be checking this stuff? Of course this was just one incident. I don't investigate every featured graph nor do I go sleuthing through the archives, but who knows what other weirdness is lurking in the dark.
It's nice to think that we can rely on users completely to decipher what data is good and what data is bad, but there should still be someone looking out for weird kinks. For example, in blogs, you'll find that the first few commenters for an entry can influence how future commenters will react. If we think of each graph as a blog entry... well, what if the discussion starts off on something that isn't there?
I can also see someone grabbing data from some site, mucking around with the data set in Excel and then uploading their finished result to Swivel. There are plenty of things that could go wrong there, like a mislabeled column, typos, or something simple but serious, like changing null values to zeros.
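The null-to-zero problem is easy to underestimate, so here's a small sketch of what it does to something as basic as an average. The numbers and function names are hypothetical, made up for illustration.

```python
def mean_ignoring_missing(values):
    """Average only the values that were actually recorded."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


def mean_with_zero_fill(values):
    """Average after replacing every missing value with zero."""
    filled = [0 if v is None else v for v in values]
    return sum(filled) / len(filled)


# Hypothetical series with one missing month (None means "not recorded").
monthly_sales = [10, None, 30]

print(mean_ignoring_missing(monthly_sales))
print(mean_with_zero_fill(monthly_sales))
```

Treating the missing month as missing gives an average of 20. Treating it as zero drags the average down to about 13.3, and nothing in the uploaded file would tell a later viewer that the zero was never a real measurement.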
There are books written about this subject -- statistical quality control -- that the Swivel people should probably read. Have they? Are they aware of it?
Swivel is a Business
Finally, at the end of the day, Swivel is a business. Minor Ventures didn't invest to not make money, and as two of the three main Swivel advisers are entrepreneurs, they're certainly going to move in the direction that leads them to money. I'm not saying that all Swivelers care about is cold, hard cash, because I'm quite sure (sort of) that there's an interest in data. What I'm saying is that certain motivations are going to push development in certain directions.
With that in mind, something doesn't feel quite right. Even though I know my data will always be free, it's possible that users with private accounts are using my data for something, but at the same time I don't get to see what they're doing or what data they have. I guess I'm selfish.
Okay, there I said it. I'm glad I got that off of my chest, but at the same time I feel wrong and dirty for saying it. I am sorry. I want to like Swivel. I really do. However, until some changes are made, I'm going to have to pass.