Dataset as worldview

March 9, 2020

Hannah Davis works with machine learning, which relies on an input dataset to build a model of the world. Davis was working with a model for a while before realizing the underlying data was flawed:

“This led to a perspective that has informed all of my work since: a dataset is a worldview. It encompasses the worldview of the people who scrape and collect the data, whether they’re researchers, artists, or companies. It encompasses the worldview of the labelers, whether they labeled the data manually, unknowingly, or through a third party service like Mechanical Turk, which comes with its own demographic biases. It encompasses the worldview of the inherent taxonomies created by the organizers, which in many cases are corporations whose motives are directly incompatible with a high quality of life.”