Maybe there’s something to this whole data science thing after all. Mike Loukides describes data science and where it’s headed on O’Reilly Radar. It’s a good read, but statisticians get lumped in with suits crunching numbers like actuarial drones:
Using data effectively requires something different from traditional statistics, where actuaries in business suits perform arcane but fairly well-defined kinds of analysis. What differentiates data science from statistics is that data science is a holistic approach. We’re increasingly finding data in the wild, and data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others.
What is data science? It’s what well-rounded statisticians do.
hey careful! some of us actuarial drones read this blog too you know :)
i wasn’t saying that all actuaries are drones – just the ones described in the article :)
The quote from Mike Driscoll (@dataspora) and the point that modelling is key and “[d]ata science isn’t just about the existence of data, or making guesses about what that data might mean; it’s about testing hypotheses and making sure that the conclusions you’re drawing from the data are valid” make up for the jab at “plain” statisticians.
plus a hug for flowingdata. i did say it was a good read :)
‘I love it when a plan comes together’. I like this part: ;-)
“Where do you find the people this versatile? According to DJ Patil, chief scientist at LinkedIn (@dpatil), the best data scientists tend to be ‘hard scientists,’ particularly physicists, rather than computer science majors. Physicists have a strong mathematical background, computing skills, and come from a discipline in which survival depends on getting the most from the data. They have to think about the big picture, the big problem. When you’ve just spent a lot of grant money generating data, you can’t just throw the data out if it isn’t as clean as you’d like. You have to make it tell its story. You need some creativity for when the story the data is telling isn’t what you think it’s telling.”