Shady research from Harvard scientist Marc Hauser has been confirmed:
On Friday, Michael D. Smith, dean of the Harvard faculty of arts and sciences, issued a letter to the faculty confirming the inquiry and saying the eight instances of scientific misconduct involved problems of “data acquisition, data analysis, data retention, and the reporting of research methodologies and results.” No further details were given.
This is why we don’t just accept any old data and why we care about the methodology behind the numbers. Stuff like this always reminds me of an exam question that asked us to investigate the data from an article in a prominent scientific journal. The analysis was all wrong.
Sometimes data is wrong out of ignorance. Other times it’s wrong because people make stuff up. I can understand the former, but why you would ever do the latter is beyond me.
[via]
Update: More details on what happened, from the research assistants’ point of view, at the Chronicle. [thx, Winawer]
Thanks for adding a bunch of banal, moralistic cliches of commentary to what is already an old story.
You’re welcome.
Reminds me of the cliché: “Statistics never lie, but liars use statistics…”
> Sometimes data is wrong out of ignorance. Other times it’s wrong because people make stuff up. I can understand the former, but why you would ever do the latter is beyond me.
In a review of David Goodstein’s new book, “On Fact and Fraud: Cautionary Tales from the Front Line of Science,” Michael Shermer, in the July issue of “Scientific American,” provides at least part of the answer to your puzzlement: “Knowing that scientists are highly motivated by status and rewards, that they are no more objective than professionals in other fields, that they can dogmatically defend an idea no less vehemently than ideologues and that they can fall sway to the pull of authority allows us to understand that in Goodstein’s assessment, ‘injecting falsehoods into the body of science is rarely, if ever, the purpose of those who perpetrate fraud. They almost always believe that they are injecting a truth into the scientific record….’ From his investigations Goodstein found three risk factors present in nearly all cases of scientific fraud. The perpetrators, he writes, ‘1. Were under career pressure; 2. Knew, or thought they knew, what the answer to the problem they were considering would turn out to be if they went to all the trouble of doing the work properly; and 3. Were working in a field where individual experiments are not expected to be precisely reproducible.’”
“but why you would ever [make stuff up] is beyond me”
Easy for you to say. You’re interested in how the data looks: how accessible it is on the page, how accurately the presentation represents the numbers beneath. The content is nearly worthless to you.
The people who spend their lives actually generating the data are more interested in what it indicates (or implies). And if one set of data indicates a breakthrough/professional fame/grant money, and another set indicates the status quo/obscurity/diminishing funding…well, it’s easy to see why data might have a mysterious preference for the former.
Are you defending this?
Not defending–just incredulous at your incredulity.
Though as Brad mentions above, only a small minority of bad science is intentionally fraudulent. Personally, I’m never surprised to discover that research like this is ‘bad’, because I’ve spent enough time at the graduate level in the social sciences to see firsthand how shoddy the methodology is in 95% of research. And not just student research, but tenured faculty research.
I’m somewhere in between you and the professors. I’m not personally invested in individual outcomes, and I don’t worry overmuch about tweaking the data presentation (though it is a hobby on the side). What I DO care about is getting the research methodology right.
In the past, I’ve occasionally commented on this site about dataviz that struck me as intentionally misleading, or as coming from compromised sources. You haven’t typically seemed sympathetic. But there IS a moral dimension to presenting others’ (bad) data. Bad data with beautiful presentation is worthless at best, and propaganda at worst. Questioning the validity of data should be just as urgent for anyone seeking to “improve” the presentation of others’ data as it would be for their own.
Nathan, chowder’s point, I think, is that as a data visualizer you don’t face ethical conflicts that determine whether you can feed your children, pay your 30-year mortgage, or keep your professional momentum.
Those who work in the bloody, shark-filled waters are tempted often. Rather than judge them solely on the quality of their data, how about adding some empathy?
I understand the motivations. What I don’t understand is how someone can do this knowing the repercussions. It doesn’t just affect the individual; it affects everyone else’s work, since research builds on other research – not to mention the grad students who work in the lab.
“… why you would ever do the latter is beyond me.”
Think “Climategate” and the C.R.U. (Unless, of course, you missed that little kerfuffle.)
That little “kerfuffle” actually amounted to very little in terms of the science, and a lot in the media, which failed to conduct a real investigation of its own, let alone report on the outcomes of the inquiries into the allegations raised.
Unsurprising really.
Am I the only one who thinks this whole story is ironic, in a life-is-stranger-than-fiction sort of way?
The prof. in question published data on the morality of monkeys. On the third attempt to narc the guy out, the students “stole” the data to get Harvard to start an investigation. The prof. in question reportedly falsified data… on the morality of monkeys.
There was a Chronicle piece on this that had some interesting details on the matter…
http://chronicle.com/article/Document-Sheds-Light-on/123988/
I encourage anyone following this story to check out the discussion going on at Language Log (http://languagelog.ldc.upenn.edu/nll/). As Mark Liberman has pointed out many times, this is a shining example of the importance of moving toward reproducible research (http://en.wikipedia.org/wiki/Reproducibility#Reproducible_research), where authors include the raw data, code, and everything else necessary for reproducing their results.
Direct language log link here: http://languagelog.ldc.upenn.edu/nll/?p=2565
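To make that concrete, here is a minimal sketch in Python of the kind of script a paper could ship alongside its raw data. The file name trials.csv and its score column are hypothetical, invented for this example and not drawn from the Hauser study; the point is that every reported number is regenerated from the raw file, and a checksum lets readers confirm they have exactly the file the authors analyzed.

```python
# Minimal sketch of a reproducible analysis script (illustrative only;
# the data file and its "score" column are hypothetical).
import csv
import hashlib
import statistics

DATA_FILE = "trials.csv"  # hypothetical raw data shipped with the paper

# Record a checksum of the raw data so readers can verify they are
# rerunning the analysis on exactly the file the authors used.
with open(DATA_FILE, "rb") as f:
    print("SHA-256 of raw data:", hashlib.sha256(f.read()).hexdigest())

# Recompute the reported statistics from scratch, straight from the file.
with open(DATA_FILE, newline="") as f:
    scores = [float(row["score"]) for row in csv.DictReader(f)]

print("n =", len(scores))
print("mean =", statistics.mean(scores))
print("stdev =", statistics.stdev(scores))
```

With something like this attached to a paper, disputes over “data acquisition, data analysis, data retention” become much easier to settle: anyone can rerun the script and compare the output against the published numbers.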