Context makes data useful. Without it, it’s easy to get lost in numbers that mean little, but finding the context of data isn’t especially straightforward. Catherine D’Ignazio explains why it’s so hard and what data journalists (or anyone trying to understand data) can do about it:
First of all, data are typically collected by institutions for internal purposes and they’re not intended to be used by others. As veteran data reporter Tim Henderson, quoting Drew Sullivan, said to the NICAR community, “Data exists to serve the bureaucracy, not the journalist”. The naming, structure and organisation of most datasets are done from the perspective of the institution, not from the perspective of a journalist looking for a story. For example, one semester my students spent several weeks trying to figure out the difference between the columns ‘PROD.WASTE(8.1_THRU_8.7)’ and ‘8.8_ONE-TIME_RELEASE’ in a dataset tracking the release of toxic chemicals into the environment by certain corporations. This is not an uncommon occurrence!