Exploring data to form better questions

April 18, 2019

Topic

Statistics

Feeding off the words of John Tukey, Roger Peng proposes a search for better questions in analysis:

“The goal in this picture is to get to the upper right corner, where you have a high quality question and very strong evidence. In my experience, most people assume that they are starting in the bottom right corner, where the quality of the question is at its highest. In that case, the only thing left to do is to choose the optimal procedure so that you can squeeze as much information out of your data. The reality is that we almost always start in the bottom left corner, with a vague and poorly defined question and a similarly vague sense of what procedure to use. In that case, what’s a data scientist to do?”

Story of my life.