Algorithms to fix underrepresentation on Wikipedia

September 12, 2018

Topic

Statistics

Wikipedia is human-edited, so naturally there are biases towards certain groups of people. Primer, an artificial intelligence startup, is working on a system called Quicksilver that looks for people who should have an article but don't yet. From Primer:

We trained Quicksilver’s models on 30,000 English Wikipedia articles about scientists, their Wikidata entries, and over 3 million sentences from news documents describing them and their work. Then we fed in the names and affiliations of 200,000 authors of scientific papers.

In the morning we found 40,000 people missing from Wikipedia who have a similar distribution of news coverage as those who do have articles. Quicksilver doubled the number of scientists potentially eligible for a Wikipedia article overnight.
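To make the idea of "missing from Wikipedia but with similar news coverage" concrete, here is a minimal sketch. It is not Primer's method, which relies on trained models. It assumes coverage can be summarized as a single mention count per person, and it uses a hypothetical cutoff: the 25th percentile of counts among people who already have articles. The function names, the cutoff, and the sample names are all made up for illustration.

```python
from statistics import quantiles


def coverage_threshold(covered_counts, pct=25):
    """Hypothetical cutoff: the pct-th percentile of news-mention counts
    among people who already have a Wikipedia article."""
    cuts = quantiles(covered_counts, n=100, method="inclusive")
    return cuts[pct - 1]


def find_missing(candidates, wikipedia_names, threshold):
    """Keep candidates with no article whose coverage meets the cutoff.

    candidates: iterable of (name, mention_count) pairs.
    wikipedia_names: set of names that already have an article.
    """
    return [
        name
        for name, mentions in candidates
        if name not in wikipedia_names and mentions >= threshold
    ]


# Made-up data for illustration only.
covered_counts = [10, 20, 30, 40, 50]
candidates = [("A. Rivera", 35), ("B. Chen", 5), ("C. Okafor", 20), ("D. Patel", 50)]
wikipedia_names = {"D. Patel"}

threshold = coverage_threshold(covered_counts)
missing = find_missing(candidates, wikipedia_names, threshold)
print(threshold, missing)
```

A single percentile cutoff is a crude stand-in for comparing whole distributions of coverage, but it shows the shape of the filter: drop anyone who already has an article, then keep those whose coverage looks like that of people who do.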

After Quicksilver finds people, it also generates sample articles to get things started.