The Like Log Study: Buzzwords and engagement

The Web is a game of pageviews, and outlets such as Twitter and Facebook are a way to rack up the counts. The more people who share your posts and articles, the more new people who visit your site. So what kind of articles are shared more often? How do people interact with these articles? Yahoo! research scientist Yury Lifshits digs into Facebook likes for some ideas, using data collected from 45 sites, 100k+ articles, and 40 million reactions between October 2010 and January 2011.

In a nutshell, The New York Times had the most likes overall (6.8 million); however, TechCrunch had the highest median likes at 498. The story with the most likes during the period? The Wall Street Journal’s story “Why Chinese Moms Are Superior” with 340,000 likes. Wowsa.

Watch the video below for more:

By the way, in case you’re wondering: yes, Wordle is currently the de facto method for non-creatives to feel creative. Jump on that wagon before it’s too late.

[The Like Log Study via Data Pointed]


  • I’m really not convinced by his interpretation of this data. The top ‘like’ stories from the top sites (NY Times, BBC, Guardian) are Zodiac signs, Chinese mothers, snowman thefts and so on.
    But these aren’t really the ‘top’ stories, are they? The top stories are the rolling coverage of Libya, Egypt, and the Japanese earthquake, tsunami and possible meltdown that keep everyone glued for updates and new stories. No one’s ‘liking’ them, though! They are liking lighthearted, funny, surprising stories that are a relief from the ‘top’ stories.
    So, to my mind, the ‘like’ stories are real exceptions from which you should not draw conclusions about the rest of the content.

    • Hey Stephen.

      This study covers mid-October to mid-January. Egypt, Libya, and Japan all happened afterwards. Japan is way bigger socially than the Chinese mothers trend. I see several stories at the 300,000+ level.

  • That’s a really great point, Stephen, and I think it’s two issues in one. A friend of mine pointed out that Facebook’s “Like” button (which will now be replacing the “Share” button) is semantically confusing, because who wants to “Like” an article about a disaster? You may want to share that article, but do you Like it? I think subconsciously people would stay away from that button because it doesn’t jibe semantically with the context in which they want to share the article.

    Additionally, you make a great point about rolling updates. Even if people were clicking Like for articles about Libya, presumably those Likes would be diluted across a wide range of articles, as there’s a new article nearly daily that one could choose to “Like”. One-off articles about Zodiac signs and Chinese mothers, however, can act as central, non-changing magnets for Likes and thus rise to the top.

  • Indeed, Jake. Even ‘share’ isn’t quite right either, as most people will share or like funny or unusual stories with friends but eschew sharing or liking polarising or depressing topics with everyone they know.

    Talking of depressing… what would be depressing is if managers of news sites took these analyses at face value and encouraged their editorial teams (as the presenter is implying) to concentrate their efforts on the human interest stories, which are OBVIOUSLY so popular on the social networks.

  • Regarding the numbers: Wait, what?!

    That data does not jive with mine at all. I use retweets, likes, tags and comments to curate TechCrunch articles. My raw data in YAML form is here: The field fb_shares is an array of Facebook Likes each hour over the first few hours of the article’s life.

    You can see that the median is nearly an order of magnitude lower than 498. (And you can verify my numbers by clicking on the TechCrunch URL right there. TechCrunch’s article also shows how many likes it got.)

    This Yahoo Scientist is seemingly calculating “median” by *averaging* in outliers like, which has six thousand likes, because you stand a chance to win an iPad2 if you “like” it. Otherwise, most posts get fewer than 100 likes.

    This reply was based on what Nathan wrote. I haven’t watched the video yet. It was just that “median” value that got me. Something seems off.

    • Is it possible that I misinterpreted the Rankings table on the site? I took the Median Story column to be median likes for a story.

      • Ah, I’ve watched it now. It’s exactly as Stephen and Jake say, too. The study seems to be flawed in a couple of ways. That’d actually be a great article, too: An analysis of what’s possibly wrong with the study and the conclusions drawn from it.

        And this motivates me to document my TechCrunch curation cronjob, too. My motivation was entirely practical – how to reduce the amount of noise in my news feed, but retain the articles I’d be interested in. Solving a practical problem led me to different conclusions than Yuri’s academic study led him to.

        (Sorry for the typos, especially writing “jive” instead of the correct word, “jibe.”)

      • Hey, let me give some answers here:

        1. Median is median. I am not averaging outliers.
        2. I use _total Facebook counts_ = Facebook shares + Facebook likes + Facebook comments.

        You can verify any count yourself. Here is the public API I use:

      • Yury, thanks for the explanation! It was wrong of me to haphazardly guess how you got numbers so much greater than the fb_shares I gather. (But you can see why I was surprised, when I thought they were only likes or shares.) Also, sorry for misspelling your name as Yuri. I’m full of assumptions and misspellings in this thread!
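
The two points that resolve the thread (a median is not an average, and a "total Facebook count" is shares + likes + comments rather than likes alone) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ArticleCounts` class, its field names, and every number below are invented for the example and come neither from the study nor from any real TechCrunch article.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class ArticleCounts:
    """Per-article Facebook counters (hypothetical field names)."""
    shares: int
    likes: int
    comments: int

    @property
    def total(self) -> int:
        # "Total Facebook count" as defined in the thread:
        # shares + likes + comments.
        return self.shares + self.likes + self.comments


# Invented numbers for illustration only: four ordinary posts plus one
# giveaway-style outlier with thousands of likes.
articles = [
    ArticleCounts(shares=120, likes=40, comments=90),
    ArticleCounts(shares=150, likes=55, comments=110),
    ArticleCounts(shares=200, likes=70, comments=160),
    ArticleCounts(shares=180, likes=85, comments=140),
    ArticleCounts(shares=900, likes=6000, comments=2500),
]

likes_only = [a.likes for a in articles]
totals = [a.total for a in articles]

print("median likes:", median(likes_only))  # barely moved by the outlier
print("mean likes:  ", mean(likes_only))    # pulled far upward by the outlier
print("median total:", median(totals))      # larger because it sums three counters
```

With these made-up numbers the median of likes alone is 70, the mean of likes is 1250, and the median of the summed totals is 405. So a median computed on total counts can sit well above a median computed on likes, without any averaging of outliers being involved.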