Through the eyes of the algorithm

December 4, 2020

Eugene Wei takes a closer look at the algorithms that drive TikTok and how the app's design provided an effective feedback loop:

But for TikTok (or Douyin, its Chinese clone), who needed an algorithm that would excel at recommending short videos to viewers, no such massive publicly available training dataset existed. Where could you find short videos of memes, kids dancing and lip synching, pets looking adorable, influencers pushing brands, soldiers running through obstacle courses, kids impersonating brands, and on and on? Even if you had such videos, where could you find comparable data on how the general population felt about such videos? Outside of Musical.ly’s dataset, which consisted mostly of teen girls in the U.S. lip synching to each other, such data didn’t exist.

In a unique sort of chicken and egg problem, the very types of video that TikTok’s algorithm needed to train on weren’t easy to create without the app’s camera tools and filters, licensed music clips, etc.

At first I was confused by TikTok. I’m still confused by TikTok. But one thing is for sure: the system knows how to serve up videos that you might find interesting. Whether that’s good in the long run is anyone’s guess.