Philosophy of data

Posted to Statistics  |  Nathan Yau

David Brooks for The New York Times on the philosophy of data and what the future holds:

If you asked me to describe the rising philosophy of the day, I’d say it is data-ism. We now have the ability to gather huge amounts of data. This ability seems to carry with it certain cultural assumptions — that everything that can be measured should be measured; that data is a transparent and reliable lens that allows us to filter out emotionalism and ideology; that data will help us do remarkable things — like foretell the future.

Be sure to read the comments. There’s actually quite a bit of anti-data talk.


  • What do you mean by ‘anti-data’ talk? I didn’t see anyone saying ‘data is bad’ or ‘we don’t need no freakin data’ or anything like that. There was some talk that there’s no pure objectivity, or that data has no meaning without interpretation, but that’s just true. It isn’t ‘anti’ to say that data don’t speak for themselves. Or is such a statement anti-data in your opinion? (fwiw, among other things I teach research methods and regularly point students to this blog; I certainly don’t consider myself anti-data but definitely want students to be skeptical and alert to the ways data gets manufactured and bent to specific ends).

    • Maybe you could count this interesting comment (heavily abbreviated):

      ‘ Much of this is nonsense. Basketball players–indeed people in many fields–have hot and cold streaks…any fool knows that in the arts and sciences we do not go by any career average but the high points, the Mona Lisa moments, the Wright brothers inventing the airplane etc. Plenty of artists do only a couple significant things–we judge by that and not by any “career average”. I suppose a newspaper columnist though would prefer career average…to call all exceptional accomplishment “just random noise”…Data is increasing, but so often I see obvious errors in interpretation of data. ‘

      Not anti-data, but skeptical towards a certain mode of thinking about data.

      I generally agree with that comment, with one thing to add: Society and history might judge these people by their flashes of genius, but there are plenty of contexts where people would be interested in the career average. If you were hiring such a person, for example, you’d probably want to know what the approximate chances were of a moment of inspiration landing in their contract with you, and you’d probably want to know how valuable (if at all) their average-quality output would be in the long stretches between flashes of inspiration, to judge how much risk that investment is worth.

      • Nothing wrong with making an informed decision, but we must not forget that quality carries value regardless of validation.

    • On the contrary, I see a fair bit of what I’d call “anti-data” talk. There are quite a few people who seem angry that data are demonstrating facts contrary to their existing thinking. The inability of people to be flexible in their thinking in the face of mounting evidence is “anti-data” in my view (and it’s rampant in this country).

      When working with data, skepticism should be a tool, not an emotion.

      I was actually quite surprised at a lot of the comments – particularly given the point of the column is not to form a conclusion, but to say “this is worth investigating and I intend to do so”. I look forward to Brooks’ take on the subject…

      • Got any examples? I’m struggling to find any. I see a lot of anti-David Brooks talk, mostly at the flakiness of some of the conclusions he’s jumping to in his “odd grab bag of thoughts and anecdotes”. A lot of this criticism sounds reasonable.

        Are you referring to the many teachers saying essentially “I don’t care what the data says about tailoring teaching to individual students – I do it every day and it works”? I’ve studied educational psychology research of the type Brooks casually quotes, so I sympathise. The quality of the research is very often terrible – cargo cult quantitative data collection at its worst. The idea of construct validity barely exists in that corner of research.

        The experimental research is very often comparing a “don’t interfere, let the teachers teach as they have done for years” condition against “impose some half-baked straw man version of X theory on the teachers and give them no time to figure out how to make it work for their students”, with tiny samples, so it’s hardly surprising that the results are often negative. Research based on observing wider trends usually neglects the fact that it’s ‘difficult’ schools or schools undergoing organisational shakeup that jump to adopt new practices, and that teachers who seemingly haven’t adopted these new practices have usually read about them and incorporated some of the ideas in a gradual organic way already.

        Negative results tend not to really mean that the idea is a bad one: rather, that heads forcing teachers to reinvent their lesson plans from scratch based on some book they’re forced to read works less well than letting them figure out how to incorporate and adapt new ideas into their lesson plans gradually.

        There are some great researchers and some great studies in educational psychology, but there are many who think that so long as they collect some numbers, any numbers, then process them correctly, that’s all that is needed to be Real Science and they can slap whatever labels and conclusions onto those numbers that they like. Data needs to work alongside, not instead of, qualitative research and critical thinking.

  • A lot of noise about not so much.
    As one of the comments summed up well, it’s all about interpretation.
    I don’t think data is objective at all. I learned that during my end-of-year school project, when I created a new scale.
    And I think the reason people disagree so much is that they think data is objective. But it isn’t, it is just another tool used by humans on humans numbers.
    Don’t get me wrong, I love data, but it is what you make of it that makes it what it is in the eyes of others.

