Which sorts of posts get more blog comments?

Paul Alper writes:

Some of your blog postings elicit many responses and some, rather few. Have you ever thought of displaying some sort of statistical graph illustrating the years of data? For example, sports vs. politics, or responses for one year vs. another (time series), winter vs. summer, highly technical vs. breezy.

I’ve not done any graph or statistical analysis. Informally I’ve noticed a gradual increase in the rate of comments. It’s not always clear which posts will get lots of comments and which will get few, except that more technical material typically gets less reaction. Not because people don’t care, I think, but because it’s harder to say much in response to a technical post. I think we also get fewer comments for posts on offbeat topics such as art and literature. And of course we get fewer comments on posts that are simply announcing job opportunities, future talks, and future posts. And all posts get zillions of spam comments, but I’m not counting them.

As of this writing, we have published 10,157 posts and 146,661 comments during the 16 years since the birth of this blog. The rate of comments has definitely been increasing, as I remember not so long ago that the ratio was 10-to-1. Unfortunately, there aren’t so many blogs anymore, so I’m pretty sure that the total rate of blog commenting has been in steady decline for years.
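As a quick check on the ratio mentioned above, the arithmetic can be sketched in a few lines. The two totals are the ones quoted in the post; the variable names are only illustrative.

```python
# Totals quoted in the post above.
posts = 10_157
comments = 146_661

# Average number of comments per post over the life of the blog.
ratio = comments / posts
print(f"{ratio:.1f} comments per post")  # prints "14.4 comments per post"
```

So the lifetime average is a bit above 14 comments per post, compared with the remembered 10-to-1.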

59 thoughts on “Which sorts of posts get more blog comments?”

  1. The gradual increase in comments probably goes along with an increase in views. Comments / views might be a more relevant metric. Also, lots of comments are not always a good thing. If you say something controversial or incorrect, then people (like me) jump into the comments to correct you. If what you say broadly makes sense, then I just read and move on. Rarely, if what you’re talking about is complicated enough and I don’t have an interest in the subject, I skim and move on.

    • There are a ton of cooking blogs that seem to come up when I search for recipes.

      So maybe commercial blogs are bigger than non-commercial blogs? And maybe there are even more people doing non-commercial communication, but they just use other things than blogs as outlets.

      I assume there are more things called cabins now than there were in 1821, but a cabin is a thing you rent on vacation now rather than a home.

    • Blogging isn’t dead and it won’t die for the next decade either. However, the way blog posts are consumed today is wildly different compared to how they were consumed ten years ago. When you search for anything on Google that is even slightly specific, the first page is likely to show you… a blog post. Also, traffic numbers are up for both news and blogging sites in 2021. People still read blogs, they just use them differently.

  2. My guess is you would need to do a careful analysis, as I think the distribution is highly skewed: certain topics seem to trigger 2-3 people (and they trigger each other) who don’t appear to know when to stop and do not appear to be aware of a sort of unwritten etiquette. In fact, if I look at the “Recent Comments” as I write this (clearly this will be different as people read this), you can see it in action.

  3. I tried to scrape this page real quickly just to see whether it’d be easy enough to get a comment count/description and I got an unexpected error message (403 Forbidden). Is that intentional/you don’t want people doing this, or unintentional/fight with the code some more?

  4. I offered up this topic because I thought it was counterintuitive that a statistics blog did not have “numbers” pertaining to how well it was doing. Naturally, any criterion chosen would be highly subjective; success, so to speak, is in the eyes of the beholder(s). Nonetheless, statistical techniques were created for evaluating all sorts of concrete and abstract endeavors. In particular, some sort of comparison might be made with competitors in the field.

    • IF there were an analysis as to how well the blog was doing, I would expect that analysis to contain numbers (and graphs!) But an important lesson of this blog is that you have to be willing to put in the work. No one has, and there is nothing in the theory of statistics that says anybody has to be interested enough to do the hard work of quantifying something if they don’t see the end result as worth the effort put in. (Actually, that’s half statistics, half economics.) If you see the virtue of doing so, do it yourself, fund someone else to do it, or convince someone else to use their time or money to do so.

    • “I thought it was counterintuitive that a statistics blog did not have “numbers” pertaining to how well it was doing. ”

      Perhaps not:

      Andrew’s interest in blogging seems to be mostly in promoting a better understanding of statistics, particularly when it comes to important issues. Slicing up the blog readership wouldn’t be much of a contribution to that mission.

      Also I suspect he already spends much of his time doing statistical mathematics. Slicing and dicing the blog readership data is probably one of the least productive and least interesting things he could do with statistics.

  5. I’m not a regular commentator, but do read pretty regularly, because the comments generally add a lot of value. This seems to be a clear difference to many other blogs, where comments often quickly degenerate into personal vendettas.
    Several possible reasons:
    — The subject matter is not insanely popular,
    — The daily postings don’t resemble click-bait, as so many do on some other blogs
    — The comments policy (essentially laissez faire, I think) promotes responsible commenting.

    So, thanks, I think this blog is a valuable resource.

  6. Here are some statistics for the ~ 1.5 years of this blog (after a bit of scraping/processing)
    I give the word, then the 16th, 50th, and 84th percentiles of the number of comments among the posts that contain that word.
    The table is sorted by the median number of comments:
    I only show words that were encountered in more than 10 posts.

    uk 15 122 182
    flu 40 110 151
    epidemiologist 27 102 208
    fraction 24 101 166
    spreading 25 101 175
    immunity 58 101 169
    unreasonable 36 99 182
    p.p.p.s 48 98 165
    epidemic 22 98 165
    approved 27 89 196
    rapid 35 87 121
    infected 30 86 170
    fatality 30 84 178
    outbreak 11 82 120
    crude 12 81 169
    lockdown 26 81 121
    virus 24 78 165
    perfectly 16 78 122
    retrospect 40 77 142
    caveat 21 77 140
    geography 33 77 147
    dying 29 76 184
    die 31 76 166
    symptom 25 75 170
    staying 27 74 116
    restriction 20 73 163
    clinton 35 72 96
    two-party 49 72 93
    screwed 27 72 159
    sensible 19 71 121
    italy 43 71 176
    scared 27 70 127
    miss 29 70 192
    detected 14 69 137
    biden 19 69 122
    closer 22 68 95
    comparable 11 68 131
    south 10 66 156
    joe 30 66 141
    commenter 20 66 116
    overestimate 17 65 128
    +/- 29 65 102
    silver 12 65 97
    indicates 30 65 73
    vulnerable 7 65 125
    antibody 30 65 171

    It’s clearly pandemic dominated. I was expecting to see Trump here, but surprisingly didn’t find it.

      • I don’t think Sergey included the number of posts in the table though, right? Just some quantile number of comments among those with the word. So it’s suggesting that there is a larger median number of comments for more postscripts, if I’m understanding correctly.

      • I don’t think anything is wrong as I rank by median number of comments for a given word.
        I.e., if the word PPPS is in N posts, I compute the median number of comments over those N posts and show that.
        I think in the case of PPPS it looks like those only happened in very popular posts.
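The per-word computation described in this thread can be sketched roughly as follows. This is a minimal illustration, not the code actually used for the table: the function names, the simple regex tokenizer, the linear-interpolation percentile, and the `(text, n_comments)` input shape are all assumptions made for the example.

```python
import re
from collections import defaultdict


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of an ascending list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def word_comment_quantiles(posts, min_posts=10):
    """posts: iterable of (text, n_comments) pairs (hypothetical input shape).

    Returns rows (word, p16, p50, p84, n_posts), sorted by the median
    in descending order, keeping only words found in more than min_posts posts.
    """
    counts_by_word = defaultdict(list)
    for text, n_comments in posts:
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        # Each word is counted once per post, however often it appears.
        for word in set(re.findall(r"[a-z']+", text.lower())):
            counts_by_word[word].append(n_comments)

    rows = []
    for word, counts in counts_by_word.items():
        if len(counts) > min_posts:
            counts.sort()
            rows.append((word,
                         percentile(counts, 16),
                         percentile(counts, 50),
                         percentile(counts, 84),
                         len(counts)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

Because each row summarizes only the posts that contain the word, a word that appears mostly in already popular posts (as suggested for PPPS) will rank high even if it is rare overall.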

  7. Seems like blogs are making a bit of a comeback with Medium and especially Substack. Hopefully that holds… anything is better than the level of conversation on Twitter…

    • As I’ve written many times, blogs > twitter. But it’s an interesting question, why is twitter so bad?

      Just to get this out of the way, yes, twitter has a lot going for it. It’s a great way for people to communicate, the brevity of the posts allows a twitter user to see lots of posts at once, and it seems to be well organized by topics so you can easily navigate to a flame war on just about any topic. The bad thing about twitter is that it often seems to bring out the worst in people. Even the most reasonable people, when they go on twitter, seem to get involved in pointless disputes.

      I do wonder whether there’s something about the interactions on twitter that encourage this sort of flame-enhancing behavior. Actually I have a feeling this would happen to me too. I have some of my favorite targets here (Weggy, gremlins, pizzagate, the sleep scientist, etc.), but the people I criticize usually aren’t interested in engaging on this blog. Or, if they do, they’ll only take a couple shots at debating before they give up. Fair enough; they have no obligation to respond, any more than Malcolm Gladwell and John Gottman have any obligation to respond to the devastating criticisms from Laurie Abraham. So I raise these potentially controversial topics when they’re relevant, but then we can move on. On twitter, though, it’s harder to avoid criticism. If I were criticizing all these people on twitter like I do on the blog, someone would tag them or “at” them or whatever it is, and they might well feel compelled to respond. If they didn’t respond, their followers might start bugging them—or they might start bugging me! There’d be little room to discuss the merits of the issue but lots of room for people to take sides. Conversely, if I were on twitter and someone made an unfounded attack on my work (yes, it happens!), then I might decide to respond, and, even if I didn’t, one of my twitter allies might respond, and all of a sudden it’s a flamewar, and then if person X is mad at me, then, by the transitive property, person Y who is friends with X will decide that I was a bad person all along, and then distasteful person Z who happens to hate X will claim me as an ally, . . .

      My point is not that this sort of side-taking is unique to twitter. We see it all the time in the regular news media and on blogs too. No, my point is that twitter seems to keep these sorts of disputes alive and on the front burner, and on blogs the flame seems to die out faster.

      Just as an example, the other day on the blog a commenter criticized me for “misrepresenting” what someone said. That annoyed me, because I didn’t see any evidence that I misrepresented anything! So I responded in the comments section. The original commenter did not reply, I guess because he never went back to check. But on twitter this could’ve gone on for days, with various third parties chiming in, nobody ever offering any evidence, but then we’d see people criticizing the commenter for not offering evidence, then other people criticizing the critics for punching down on the commenter, etc etc.

      So my problem with twitter is partly that it doesn’t typically seem to give people the room to make any arguments or offer much in the way of evidence, and second that just about any dispute seems like it can spiral out of control. This might not be inherent to twitter; maybe it’s just how it developed. And of course there could be all sorts of wonderful things on twitter that I’m not focusing on. I’m not claiming that twitter is a net negative; who am I to say? Lots of people seem to find it very useful. I’m just saying I don’t like it.

      • “The original commenter did not reply, I guess because he never went back to check. ”

        This and what follows it is the driving difference, I think.

        1. Twitter notifies you of activity related to things you’ve interacted with. This is inherent to social media platforms (which I do not consider blogs to be.)
        2. It’s too annoying to be detailed and clear in a reply there, and long replies and chains of them are too convoluted to follow easily… which makes it easier to just jump in without actually following through the conversation.

        I wish I could tolerate the annoying/stressful things to take advantage of the good parts (community building, sharing useful resources, being informed of events of interest), but I can’t so I just don’t even look.

        • Twitter does have that advantage: it stops people from being too verbose.

          But it’s taken that to the other extreme, with all the 1/n posts, which are a hack around the Twitter word limits.

          Bloggers forget that a good, nuanced (long) explanation only helps the audience if they are going to read all of it!

          One blogger that comes to mind is Slate Star Codex. Way too verbose. The content is good but if only the guy figured out brevity!

          Marginal Revolution and Andrew seem to get the length just right at its Goldilocks point.

        • Perhaps some kind of system where every comment is run through a flame filter and also has to be at least 300 words but less than 1000

          Twitter is just full of 7 word “me too” type posts or just straight up retweet amplifying

        • Sure, Twitter has many of these positives and negatives, but it’s not the only low barrier-to-entry social media platform out there. Reddit is often where long-form social interaction takes place among strangers these days. Interactions can be more nuanced and technical compared to Twitter, sometimes on par with a blog like this. Troll-like comments can eventually be auto-filtered out of view through downvotes. Quality comments can rise to the top of the feed (and/or remain in view) through upvotes. For a given thread, comments can be sorted by recency, popularity, “best”, etc. Across threads (posts), an individual post can rise to the top based on activity–there’s even a decay function built in that limits how long a popular post remains at the top of the rankings. There are downsides to this system as well, I think. (Comments can devolve into a popularity contest, of sorts.)

        • Oh yes, I think reddit is quite good.

          Affixing “reddit” or “stack exchange” to a certain type of google search query yields especially great results.

          E.g., most tech recommendations.

      • Following up on this discussion: another advantage of twitter is that there is a low barrier to entry. Sure, anyone can start up a blog too, but it probably won’t get any notice. But if you’re a statistician, say, and you want to become noticed on twitter, you can make a splash by posting Bayes Rulez! or P-Values Suck! or connect statistics to some contentious political position, and tag a bunch of active twitter personalities. This won’t in itself get you thousands of followers (there’s still a greasy pole of success that would have to be climbed), but you can at least be part of the conversation without much initial investment required. Blogging was that way in 2002 or 2003: you could start a blog, post reactions to other blogs, include lots of links, and you’d be part of the conversational web. But not so much anymore.

        So I can see the appeal of twitter in that it lets anyone jump in and join the conversation. I’m just not so happy about what those conversations look like.

      • The character limitation on Twitter discourages nuance in discussions. Nuance results in unreadably long threads. Blogs develop communities who help self-enforce standards of conduct. On Twitter, people will see trending tweets and jump in. It’s so largely used that the “standard of conduct” is the lowest common denominator.

        I think Meg’s point about notifications is also relevant. They fan the flames when there is uncivil discourse.

        I pretty much only look at Twitter when a blogger I respect links to a thread of interest. That’s the best way to get a high signal-to-noise ratio, I’ve found. But I’ve never replied to anyone there, while I’ve (obviously) left comments on blogs.

  8. If someone had the time (or an interested student), it could be a very interesting analysis to use topic modelling to identify topics across the corpus of posts and see if these were predictive of topics across the corpus of comments. You could of course reverse the prediction and predict post topics from comment topics. That implies separate topic models for the posts and comments, but you might be able to do it with a single model built from the whole corpus of posts and comments combined. The stm package in R by Roberts et al. could be great!

    Roberts, M. E., Stewart, B. M., & Airoldi, E. M. (2016). A Model of Text for Experimentation in the Social Sciences. Journal of the American Statistical Association, 111(515), 988–1003. https://doi.org/10.1080/01621459.2016.1141684

  9. I am going to play the heathen here. This blog does not try to sell anything, it is not attempting to be propaganda and it does not have adverts. So I have to ask if the question of ‘how well a blog is doing’ is even meaningful, particularly if measured by comments. It strikes me that as long as the authors are happy with their content, then the blog is doing a great job of allowing them to talk to the world. I realise that everything in social media is targeted on trying to gain followers, get likes, but this is all based on the desire to sell something, not just being a platform for communicating ideas. Sometimes it is good to enjoy something for what it is rather than judging it on how many other people are also liking it. But maybe I belong in some other century.

    • I still post occasional stuff to my blog. But it’s mostly example analyses, graphs, or descriptions of ways I’ve solved problems. There isn’t enough content to drive regular viewers and so there is no or very little conversation. But a major reason to participate in blogs for me is conversation with like-minded people. The reason I come here still is because of the people and less so the individual topics. I’m hoping once COVID dies down we can talk more about mathematical modeling or decision analysis or policy and less about COVID, plagiarism, and psychologists who are totally self unaware of their utter failure to do real research.

  10. If anyone wants to study this, these are the most commented words by year:
    https://gist.github.com/segasai/cce03527079014dfaea300e1121aa622
    The columns are the year, the word, the 16th percentile of the number of comments among posts containing the word, the 50th percentile, the 84th percentile, and the number of posts with this word.
    Only words with > 5 posts were used. For each year I give the top 100 words.

    Here is the subset with the top 3 words for each year.
    2010,made,5,20,31,6
    2010,person,5,19,30,7
    2010,american,3,19,27,9
    2011,copied,6,30,43,6
    2011,tenured,9,30,32,7
    2011,pressure,11,30,39,7
    2012,animal,33,48,76,6
    2012,clue,32,40,55,7
    2012,economically,14,39,44,6
    2013,persists,10,51,79,6
    2013,excluding,27,51,99,7
    2013,restricted,31,51,79,7
    2014,agreement,42,75,201,9
    2014,sexual,41,71,164,6
    2014,clearer,14,70,136,10
    2015,’s,6,71,132,6
    2015,yup,45,69,115,8
    2015,percent,15,64,89,13
    2016,trusted,15,109,191,6
    2016,satoshi,35,109,199,6
    2016,discredited,27,109,191,6
    2017,time-reversal,32,104,198,6
    2017,losing,18,90,106,7
    2017,terrorist,28,88,131,6
    2018,frustration,6,59,132,6
    2018,motivating,47,55,124,6
    2018,commenting,8,54,123,7
    2019,objection,45,181,323,6
    2019,listing,21,119,323,6
    2019,racism,72,110,120,6
    2020,herd,91,152,272,6
    2020,oz,72,152,360,9
    2020,sweden,28,139,207,6
    2021,pointed,71,108,190,6
    2021,dr.,13,85,182,6
    2021,lockdown,26,83,190,6
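For anyone who wants to work with rows in this format, here is a minimal sketch of how a top-k subset per year might be pulled out. It assumes the six comma-separated columns described above with no header row; the function name and the tie-handling (input order is kept for equal medians) are choices made for this example, not a description of how the list above was produced.

```python
import csv
from collections import defaultdict


def top_words_per_year(csv_lines, k=3, min_posts=5):
    """Return {year: [(word, p16, p50, p84, n_posts), ...]} with the k rows
    that have the highest median (p50) in each year.

    csv_lines: iterable of strings like "2020,herd,91,152,272,6".
    Only words found in more than min_posts posts are kept.
    """
    by_year = defaultdict(list)
    for row in csv.reader(csv_lines):
        if not row:
            continue  # skip blank lines
        year, word, p16, p50, p84, n_posts = row
        if int(n_posts) > min_posts:
            by_year[int(year)].append(
                (word, float(p16), float(p50), float(p84), int(n_posts))
            )

    top = {}
    for year, rows in by_year.items():
        # sorted() is stable, so rows with equal medians keep their input order.
        top[year] = sorted(rows, key=lambda r: r[2], reverse=True)[:k]
    return top


# A few rows copied from the subset above.
sample_rows = [
    "2020,sweden,28,139,207,6",
    "2020,herd,91,152,272,6",
    "2020,oz,72,152,360,9",
    "2021,pointed,71,108,190,6",
]
top_two = top_words_per_year(sample_rows, k=2)
```

Ranking by the median alone ignores how many posts each word appears in, so with only 6 to 10 posts per word the ordering within a year is likely to be noisy.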
