This guest post is by Andrew Gelman of Statistical Modeling, Causal Inference, and Social Science. He answers the question: “What is data and why should we care about it?”
Good data are better than bad data, but worst of all are data whose quality you can’t assess. Beyond this, we want to use statistical methods that allow us to combine data from many sources. I’m comfortable with regression and multilevel models, but other methods are out there too. In any case, we have to care about our data because inferences and decisions are just about always data-based, implicitly if not explicitly. Being the person in the room with the hard data gives you authority, as well it should.