Why I Do Not Swivel Data

I’ve been back and forth on whether or not I wanted to post about this. Two reasons: I feel blasphemous feeling this way; and I’m not sure if I’m working for or against my hopes for data awareness. I also think I might be getting some mild form of carpal tunnel. Ow.

I’m a graduate student in Statistics, and I don’t like Swivel. Why? How is that even possible? All of my work encircles data, I blog about flowing data, and I read about data. So why can’t I force myself to enjoy the “tasty data treats for data geeks” offered by Swivel?

People (Not) Part of the Team

There are smart people behind Swivel. That’s clear. However, there seem to be some missing pieces.

The company’s founders have backgrounds in physics (and apparently computer science), and there are three main advisers. Two of them, I’m guessing, are very computer science oriented, while the third has business in mind.

Let’s think about this for a second.

Swivel is a data store that houses lots of different types of data from various sources. Data is then visualized with some graph/plot/map thing and correlated with other data sets that already exist in the database.

I’m going to pause it right there.

There should really be a data/information visualization expert advising Swivel, because they are clearly lacking one from the looks of their graphs (which I will get to soon). Secondly, they’re trying to do some serious correlations between multiple-type data sets. For this part in the data chain, Swivel needs to up their statistical expertise. Yes, we all took or will take that introduction to statistics course in undergrad; however, that’s not enough. Reliability gets much more complicated for real-world data.

If there’s already a stat and/or visualization expert on the job then please forgive my ignorance.

Visualization Needs Improvement

I think the type of visualization Swivel offers is an indication of who Swivel caters to and the type of data they plan to focus on. What do those visualizations remind you of? They remind me of PowerPoint, and when I think of PowerPoint, I think of boring talks and silly graphs. Why must people put background images on their graphs? The only purpose those pictures serve is to distract from what should be the focus: the data.

There’s also the interaction problem. Outside of scrolling over a bar or point to see the value, there isn’t much else. Well, I take that back. You can click on spots that will take you to a data table, but in turn, you’re taken away from the viz, and you’re just back at where you started.

I’m Not Sure About the Statistical Validity

There are “correlations” for each graph, but what do they mean? I’m looking at the Hurricane Katrina graph, and I see that White and Black are 97% correlated. I’m sure they’re plugging into some formula, but what assumptions are being made about the data? How do people use these numbers if they don’t understand correlation?
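To make the worry concrete, here is a minimal sketch. I don’t know what formula Swivel actually plugs into, so I’m assuming the plain Pearson coefficient, and every number below is made up for illustration. The two series have nothing to do with each other except that both happen to rise over time. The raw correlation comes out very high, while the correlation of their year-over-year changes is much weaker.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def differences(values):
    """Change from each observation to the next (removes a shared trend)."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical yearly values, invented for this example only.
series_a = [10, 14, 15, 19, 23, 24, 28, 31]
series_b = [200, 228, 236, 270, 281, 312, 330, 349]

r_raw = pearson_r(series_a, series_b)
r_changes = pearson_r(differences(series_a), differences(series_b))

print(f"correlation of raw values: {r_raw:.2f}")
print(f"correlation of changes:    {r_changes:.2f}")
```

The point is not that differencing is the right fix for every data set. It is that a single number labeled “correlation” carries assumptions (linear relationship, no shared trend, comparable observations) that a graph caption doesn’t tell you about.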

Swivel also has a comparison feature to place one data set against another. Finally we can compare our sales data to the weather, Swivel tells me. I don’t know about this one. Again, I’m not sure what they’re doing or what assumptions are being made about the data. Honestly, I haven’t been able to make any successful comparisons using the weird user interface.

Maybe I’m just dumb.

Needs Better Data Quality Control

I previously pointed to a Swivel graph that showed use of steroids and performance enhancing drugs. The only problem was that the data was for testing — not use. That’s a big difference, yes?

This wasn’t just some random graph that I dug up in the depths of some deep, dark abyss, hidden away for no one to see. It was a featured graph on the Swivel homepage. Shouldn’t someone be checking this stuff? Of course this was just one incident. I don’t investigate every featured graph nor do I go sleuthing through the archives, but who knows what other weirdness is lurking in the dark.

It’s nice to think that we can rely on users completely to decipher what data is good and what data is bad, but there should still be someone looking out for weird kinks. For example, in blogs, you’ll find that the first few commenters for an entry can influence how future commenters will react. If we think of each graph as a blog entry… well, what if the discussion starts off on something that isn’t there?

I can also see someone grabbing data from some site, mucking around with the data set in Excel and then uploading their finished result to Swivel. There are plenty of things that could go wrong there, like a mislabeled column, typos, or something simple but serious, like changing null values to zeros.
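Here is a small sketch of why that last one matters. The readings are hypothetical, and `None` stands in for a blank spreadsheet cell. Treating the blanks as missing and treating them as zeros give two very different averages from the same column.

```python
# Hypothetical readings; None marks a value that was never recorded.
readings = [52.0, None, 49.5, 51.0, None, 50.5]


def mean_ignoring_missing(values):
    """Average over only the values that were actually recorded."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no recorded values to average")
    return sum(present) / len(present)


def mean_with_zero_fill(values):
    """Average after replacing every missing value with zero."""
    if not values:
        raise ValueError("no values to average")
    filled = [0.0 if v is None else v for v in values]
    return sum(filled) / len(filled)


honest_mean = mean_ignoring_missing(readings)
zero_filled_mean = mean_with_zero_fill(readings)

print(f"mean, missing values left out:  {honest_mean:.2f}")
print(f"mean, missing values set to 0:  {zero_filled_mean:.2f}")
```

Once the zeros are in the uploaded file, nobody looking at the graph can tell which ones were real measurements and which were blanks.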

There are books written about this subject — statistical quality control — that the Swivel people should probably read. Have they? Are they aware of it?

Swivel is a Business

Finally, at the end of the day, Swivel is a business. Minor Ventures didn’t invest to not make money, and as two of the three main Swivel advisers are entrepreneurs, they’re certainly going to move in the direction that leads them to money. I’m not saying that all Swivelers care about is cold, hard cash, because I’m quite sure (sort of) that there’s an interest in data. What I’m saying is that certain motivations are going to push development in certain directions.

With that in mind, something doesn’t feel quite right. Even though I know my data will always be free, it’s possible that users with private accounts are using my data for something, but at the same time I don’t get to see what they’re doing or what data they have. I guess I’m selfish.

Okay, there I said it. I’m glad I got that off of my chest, but at the same time I feel wrong and dirty for saying it. I am sorry. I want to like Swivel. I really do. However, until some changes are made, I’m going to have to pass.

9 Comments

  • Thanks for your very thoughtful post about Swivel. We whole-heartedly agree with your concerns about visualization, data quality and statistical validity.

    Swivel has been alive for less than 9 months and we really see ourselves as approaching the start line not the finish line. We hope to address many of your concerns over the coming months.

    We are a business and it is our deeply held belief that we will do well by doing good. Creating a Web site that allows people to engage with data in a meaningful way is the key to enabling great decisions and improving people’s lives both in the private sector and the public sector.

    Brian Mulloy
    CEO & Co-founder
    Swivel
    415.680.3641

  • Nathan, Swivel has a lot going for it and, as you pointed out, it also has challenges. All the data visualization sites are in the early stages, so each one has the ability to respond to feedback. Data360 is a website where people can find, analyze and present data. We see ourselves as more serious than Swivel and we think that our analytical, presentation and reporting features are more robust, as well as more dependent upon the judgment of the person posting the data. My background is business finance and strategy. I would be grateful for your review of our site, which can be found at http://www.data360.org. Best regards, Tom Paper

  • @Brian: Thanks for hearing me out. I know it might seem like I’m completely against you and Swivel, but you should know that I’m rooting for Swivel’s success. I just hope that it’s success for all the right reasons.

    @Tom: Data360 (among others) has been in my field of vision for a while now, so a review of some sort is definitely on my todo list.

  • Hi Nathan,

    I work at the OECD. In 3 words my job there is to publish official data.

    Early on, we at the OECD became interested in working with Swivel, as we like their approach.

    Traditionally, we publish large & complex datasets complete with detailed source & methods documentation. Obviously we are going to continue doing that but we feel that people outside of statistical offices, central banks and research labs should also get access to numbers from the OECD that they can comprehend.

    Enter Swivel. As you can guess, when I first tried it I had a number of gripes. Obviously the data visualization tools were not as powerful as what we have in-house. Plus, it seemed much easier to build a completely irrelevant graph than something meaningful. Yet there are quite a few things that Swivel did very well: allow us to add extra information to our hearts’ content, add tags, interact with users via comments, etc. So while I was less than thrilled by the content we could post, I was quite excited by how it could be published.

    After a while, Swivel and we started talking. We were so concerned that people could produce junk graphs out of our content, which would still bear our brand, that together we came up with the concept of “official source”, which they implemented in a couple of days. Then, each time we provided feedback on content and on how to do things better on the purely statistical front, they listened to us, and we saw the application evolve very rapidly. We still talk and exchange ideas. Furthermore, ever since we started exploring Swivel, quite a few other data-oriented IGOs and official data providers have been curious. We talk together, and they talk to Swivel, who prove to be very receptive. Swivel are working on quite a few ideas which I believe would dramatically improve the site and the service, and some of these ideas and suggestions come from veteran statisticians all around.

    Ultimately, Swivel caters to a wide audience, so they’ll focus on the kind of data that’s understandable by many, rather than on expert analysis. You’ll still see fun graphs put forward which may hurt the credibility of more relevant data. Yet I believe that it has a role to play. At the very least it challenges the way we “official” data people consider publishing data, which is already quite an accomplishment!!!

  • @vozome: You bring up some excellent points and it’s GREAT that the OECD is aiming to make their data more available. I agree that Swivel is playing a good role in changing the way people consider publishing data.

    As strictly a data store, Swivel is on the right track, but is this what Swivel aims to be? I don’t think it is. Swivel markets itself as an application to upload and explore data.

    With that in mind, how many people in Swivel’s wide audience will download the data and explore with their desktop tools? That’s when all the Swivel graphs start to trouble me, because, as you noted, they’re not especially useful and many (or most) are useless. Then the trouble begins when people start “interpreting” the graphs, perhaps seeing something that isn’t really there.

  • I have nothing much to add to the discussion, besides mentioning it is very interesting to read – thank you for your post.