Twitter trend detection algorithm

Posted to Statistics  |  Tags: , ,  |  Nathan Yau

Stuff happens, and people tweet about it. Something major happens, and a lot of people tweet about it. Masters student Stanislav Nikolov and his adviser Devavrat Shah are working on ways to algorithmically detect the latter.

People acting in social networks are reasonably predictable. If many of your friends talk about something, it’s likely that you will as well. If many of your friends are friends with person X, it is likely that you are friends with them too. Because the underlying system has, in this sense, low complexity, we should expect that the measurements from that system are also of low complexity. As a result, there should only be a few types of patterns that precede a topic becoming trending. One type of pattern could be “gradual rise”; another could be “small jump, then a big jump”; yet another could be “a jump, then a gradual rise”, and so on. But you’ll never get a sawtooth pattern, a pattern with downward jumps, or any other crazy pattern.

And with that, the algorithm compares current patterns to the ones above. If they look like a trending pattern, the algorithm marks something as a trend with some probability. In testing with past trending topics, the algorithm was able to pick correctly over 90 percent of the time.

The best part is that this method can be applied to other time series data. “We can try this on traffic data to predict the duration of a bus ride, on movie ticket sales, on stock prices, or any other time-varying measurements.”


Years You Have Left to Live, Probably

The individual data points of life are much less predictable than the average. Here’s a simulation that shows you how much time is left on the clock.

Jobs Charted by State and Salary

Jobs and pay can vary a lot depending on where you live, based on 2013 data from the Bureau of Labor Statistics. Here’s an interactive to look.

Reviving the Statistical Atlas of the United States with New Data

Due to budget cuts, there is no plan for an updated atlas. So I recreated the original 1870 Atlas using today’s publicly available data.

Where Bars Outnumber Grocery Stores

A closer look at the age old question of where there are more bars than grocery stores, and vice versa.