Blogging’s a great way to express your ideas.

Nasir Bashir writes:

I’m a PhD student in biostatistics and have recently started up a blog where I use short stories to explain concepts from statistics in an engaging manner.

For example, my most recent post uses democracy and voting to explain how confidence intervals work.

Cool! I just have a couple of comments, places where I think this post isn’t quite correct.

But before going on let me emphasize that I’m happy to see this new blog, and I encourage everyone who is comfortable with this format to blog your thoughts, for reasons discussed here.

So, although I’m critical of a couple of things in that particular post, I’m positive about the effort.

So now for the details.

In his post, Bashir writes:

Bayesians will happily provide you with a credible interval: the range within which, given your data and your model, there is a 95% probability the true value lies. That is the statement people think they are hearing when someone reports a confidence interval. The catch, of course, is that Bayesian intervals rest on prior beliefs, which can be contentious. There is no free lunch.

That’s fine, but . . . non-Bayesian uncertainty intervals (that’s the term I prefer to “confidence intervals”; Sander Greenland prefers “compatibility intervals”) are based on assumptions too! Indeed, Bayesian inferences are not necessarily based on prior beliefs; they’re based on prior distributions, which are modeling assumptions. If we assume a certain prior distribution, which may be determined based on data (as here), that’s not necessarily something we believe; it’s something we’re assuming. Just as if Ronald Fisher or Jerzy Neyman or whoever is doing some math with the Poisson distribution or the logistic curve or whatever, there’s no reason to think they “believe” these models; they’re just using them. And these Poisson distributions and logistic curves should be contentious too! I also recommend our paper, Beyond subjective and objective in statistics.

The other thing is that Bashir writes about opinion polls as if the sampling-based intervals can be taken literally:

There’s the familiar theatre of exit polls, swinging graphics, and pundits talking with the air of a Delphic oracle. “Candidate A leads with 54% of the vote,” they declare, “with a 95% confidence of plus or minus 3 points.” . . . In our election example, imagine running the same poll over and over – new samples of voters each time, same questions, same methodology. . . . The great law of the universe is that, in the long run, about 95% of those intervals will contain the truth. That’s all the “95% confidence” means, not certainty about your one poll, but long-run coverage across infinitely many.

Actually, no! To get 95% coverage, you pretty much have to double the width of the polls’ nominal confidence intervals. Our 2018 paper, Disentangling bias and variance in election polls, provides some empirical evidence of this point.
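
To see why sampling-only intervals can undercover, here is a minimal simulation sketch. It is not taken from the paper. It assumes each poll's target proportion is shifted from the truth by a normally distributed nonsampling error, and the value `bias_sd=0.027` is an illustrative assumption chosen so that total error is roughly twice the sampling error for a poll of 1000 respondents. The function name `poll_coverage` and all parameter values are hypothetical.

```python
import math
import random


def poll_coverage(bias_sd, width_multiplier=1.0, n_polls=2000, n=1000,
                  p_true=0.54, seed=1):
    """Fraction of simulated polls whose interval contains p_true.

    bias_sd is the assumed standard deviation of nonsampling error;
    width_multiplier scales the nominal 95% half-width.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_polls):
        # Nonsampling error (assumed normal): the poll targets a shifted value.
        p_poll = min(max(p_true + rng.gauss(0.0, bias_sd), 0.0), 1.0)
        # Sampling error: n independent yes/no responses around the shifted value.
        yes = sum(rng.random() < p_poll for _ in range(n))
        p_hat = yes / n
        # Nominal interval uses the sampling standard error only.
        half_width = width_multiplier * 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if abs(p_hat - p_true) <= half_width:
            covered += 1
    return covered / n_polls


if __name__ == "__main__":
    print("no nonsampling error, nominal width:", poll_coverage(0.0))
    print("with nonsampling error, nominal width:", poll_coverage(0.027))
    print("with nonsampling error, doubled width:",
          poll_coverage(0.027, width_multiplier=2.0))
```

Under these assumed settings the nominal intervals cover the truth well short of 95% of the time, and doubling their width brings coverage back to roughly the nominal level. Real polling errors need not follow this error model.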

The ironic thing is that, while telling people to avoid the naive interpretation of a confidence interval, Bashir is naively accepting the interpretation of a poll that excludes nonsampling error.

OK, don’t get me wrong, I don’t think Bashir actually thinks that polls are correct; he’s just trying to correct a statistical misconception. But you have to be careful when explaining things!

That all said, I’m thrilled to see more people blogging about statistics, so I’m glad that Bashir sent this to me. Everyone makes mistakes, nothing wrong with that. It’s a good idea to put your ideas out there clearly enough that your mistakes become visible. Indeed, that’s one of the main purposes of writing.

A final blogging recommendation

Write about your own experiences. If you’re a biostat student and you want to write about statistical ideas, use a biostat example—then you’ll have some local expertise you can draw on. That’s better than writing about polling, if that’s not an area you work on.

I have a similar problem when people use criminal justice analogies to explain hypothesis testing (the hypothesis is “guilty” or “innocent,” etc.). If criminal justice is your area, fine. If not, I think that bringing in the outside analogy just confuses matters.

Remember, God is in every leaf of every tree. You can become a better blogger—and a better researcher—by looking hard at the leaves that are right in front of you.

26 thoughts on “Blogging’s a great way to express your ideas.”

  1. Bashir makes a point (as does Andrew in the linked BMJ article) that I’ve always kind of struggled with, which is sharply distinguishing “in the long run 95% of intervals constructed in this manner will contain the true value of the parameter” from “there is a 95% chance this particular interval contains the true value of the parameter.”

    I understand there are philosophical issues where a sort of pure frequentist doesn’t want to say that a parameter has a probability of doing anything because it’s a fixed value, but putting that aside for the moment (as I understand it the frequentist can instead talk about the probability that the interval covers the parameter since the interval is a random variable), my interpretation of “in the long run 95% of intervals constructed in this manner will contain the true value” would be that if the existence of this interval is my only piece of information about the true value, then my best estimate is that the probability it is in (or covered by) the interval is 0.95, since the probability that the interval is one of those 95% should be 0.95.

    I understand this is the exact view that’s always being explicitly forsworn, but I’ve never fully understood the issue with it, or what the alternative is. Is it that while 95% of such intervals will contain the true value, that doesn’t mean that this particular interval has a 95% chance of being one of those intervals? And if this interval is the only piece of information I have and I’m forced to assign a probability to it containing the true value, is 95% not the best choice? Is there another, better choice, or is there somehow nothing I can say on the matter?

    • Michael:

      It can be true that on average 95% of the intervals contain the true value, without this being the case for any particular interval!

      Here’s a famous example. Construct a confidence interval for some unknown continuous parameter theta as follows. Choose a random integer between 1 and 100. If it is between 1 and 95, declare that your interval is (-infinity, +infinity). If it is between 96 and 100, declare that your interval is (0, 0). This procedure will have 95% coverage, but any particular interval will be known to include the parameter or not. The point is that there is information in the interval itself.

      The purely-random interval example is silly, but in real life any confidence interval will have some randomness (coming from the data), and the general principle holds, that unconditional coverage does not imply conditional coverage.
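
      The data-free procedure described above can be sketched in a few lines. This is only an illustration: the function names, the value `theta=3.7`, and the number of repetitions are arbitrary assumptions, and (0, 0) is treated as an empty open interval.

```python
import random


def random_interval(rng):
    """Return (lower, upper) from the data-free procedure in the example."""
    draw = rng.randint(1, 100)  # uniform integer from 1 to 100 inclusive
    if draw <= 95:
        return (float("-inf"), float("inf"))
    return (0.0, 0.0)


def covers(interval, theta):
    """True if theta lies strictly inside the open interval."""
    lower, upper = interval
    return lower < theta < upper


def coverage_summary(theta=3.7, n_reps=100_000, seed=0):
    """Overall coverage, plus coverage conditional on which interval appeared."""
    rng = random.Random(seed)
    counts = {"whole_line": [0, 0], "empty": [0, 0]}  # [hits, total]
    for _ in range(n_reps):
        interval = random_interval(rng)
        kind = "whole_line" if interval[0] == float("-inf") else "empty"
        counts[kind][0] += covers(interval, theta)
        counts[kind][1] += 1
    overall = sum(hits for hits, _ in counts.values()) / n_reps
    conditional = {k: hits / total for k, (hits, total) in counts.items() if total}
    return overall, conditional


if __name__ == "__main__":
    overall, conditional = coverage_summary()
    print("overall coverage:", overall)
    print("coverage given the interval seen:", conditional)
```

      The overall rate sits near 95%, while the rate conditional on the interval you actually observe is either 1 or 0, which is the sense in which the interval itself carries information.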

        • Anon:

          No, you’re missing the point! When I say the interval “includes some randomness (coming from the data),” I’m referring to the randomness that arises from the data themselves being random variables. Under the model, y is a random variable, so any confidence interval C(y) is also a random variable, even if C is a deterministic function of y.

        • Andrew:

          I only replied to this example CI construction:

          “Choose a random integer between 1 and 100. If it is between 1 and 95, declare that your interval is (-infinity, +infinity). If it is between 96 and 100, declare that your interval is (0, 0). This procedure will have 95% coverage, but any particular interval will be known to include the parameter or not. The point is that there is information in the interval itself.”

          This is not a valid CI construction under classical theory, since it’s not a deterministic function of your observations.

        • Anonymous’ remark is about the procedure where you “Choose a random integer between 1 and 100.” That’s not randomness coming from the data. The confidence interval thus constructed has nothing to do with any data.

        • Carlos:

          My point is that it doesn’t really matter if the randomness comes from an external supply or from the data; either way the confidence interval is a random variable. It’s fine that there are some examples where the randomness is external and other examples where the randomness is from the data. You can even get external randomness, in effect, from the data by doing something like taking the 99th and 100th places in the decimal expansion of one of the data points, or whatever.

        • > My point is that it doesn’t really matter if the randomness comes from an external supply or from the data; either way the confidence interval is a random variable

          I don’t see how that is related to whether it’s a requirement for a confidence interval to be a deterministic function of the observed data or not.

          Anonymous thinks that it is a requirement. It seems that you think it is not a requirement but your reply didn’t address that. (I have no particular opinion.)

    • Consider how those confidence intervals are usually calculated. I’ll ignore complexities. One calculates the sample mean and sample variance, then uses them to construct the confidence intervals, right? But the actual confidence intervals are based on the population mean and variance. We don’t know them; we only have estimates based on our sample. And any finite sample we have will not include certain outliers (I’m thinking of an unbounded distribution here), and without knowing about them we can’t accurately construct the confidence intervals. After all, the actual distribution might not be Gaussian in the tails, if we are assuming a Gaussian.

    • I’m with Michael here, and I’ve said it before. While the repeated sampling or long-run statements are indeed correct – and the interpretation of the single calculated confidence interval is not – I don’t see the value in repeatedly emphasizing this. It is good to highlight what the correct confidence interval interpretation is, but in practice when you have one sample what are you supposed to say about the calculated confidence interval? One option is to say nothing and say that the information you have is worthless. Another is to use the incorrect statement that you are 95% sure, certain, confident, probably correct, etc. that the calculated interval contains the true value. We can argue about the language and it indeed is important. It is probably better to use less confident terms, especially given that nonsampling errors are being ignored. But I think the insistence on the correct interpretation of the confidence interval is overstated. I know others on this blog disagree – but rather than just disagreeing, I’d like to see exactly what you would like someone to do with a confidence interval derived from a single sample. And if you want to say “nothing” then I’d like to hear exactly what kind of “evidence” you would use for the myriad policy questions faced in social science (e.g., effects of unemployment insurance, effects of tariffs, effects of vaccinations). I am not advocating against the need for replication and additional studies, but often there isn’t the time or resources to wait before decisions must be made.

      • I think the issue is that, in certain contexts, a Bayesian uncertainty interval and a valid confidence interval can diverge drastically. This paper has some fun examples involving a submarine: https://link.springer.com/article/10.3758/s13423-015-0947-8

        In a lot of the most common applied cases though, like estimating the location in a location-scale distribution, the confidence interval and a “naive” Bayesian model will agree. So agree that non-specialists should mostly not care at all. Non-statistician scientists should probably also not worry that much, but at least be vaguely aware of the issue.

  2. >> Bayesian intervals rest on prior beliefs

    > non-Bayesian uncertainty intervals are based on assumptions too!

    The intended message is, I think, that Bayesian intervals are based on those assumptions you mention and then more.

    It could be more clearly stated, using “distributions” instead of “beliefs” if you want, as “Bayesian intervals rest on the assumption of prior distributions for the true value.”

  3. Andrew, I’d suggest a note of caution in recommending “blog your thoughts” to people. Something which works fine for you, as a well-established, senior professor, may not be such a great idea for a person not so well-established and not so senior. For example, you can freely deride the work of prominent figures, and generally there isn’t anything too bad they can do to you. It’s not clear someone just starting out has the same privilege, and so there are quite a few risks you may not see yourself.

    Moreover, even if someone starts out writing about only technical topics, it’s very tempting to one day just sound off on politics or culture which the blogger feels strongly about. I’m currently reading a mess involving a blogger who is a computer scientist and writes about his expertise, but also other posts about what matters to him personally. Relevant here, he’s passionately pro-Zionist and pro-Israel and defends Israel’s actions in Gaza. This is going about as well for him as you can imagine. There are nasty public arguments with other academics, stuff where someone’s likely to hold a grudge. Some activists on the other side have seemed to talk about (my perhaps biased summary) filing complaints against him as, in my view, a harassment tactic, though to be fair I doubt they’d get far – still, it’s disturbing. Not to mention he’s the target of a horde of trolls who get their kicks out of trying to emotionally hurt him, or just generally flinging crap at him and seeing if they can make anything stick.

    Given his status, and the current overall situation, he’ll probably be OK, at least formally. But if he didn’t have tenure, and the political winds were different, I suspect those posts would have a nontrivial chance of getting him fired. Now, it would be easy to say, “Don’t do that”. But it’s just not so simple. It’s like gambling. “Don’t risk more than you can afford to lose” is fine advice, but in the heat of the moment, many people make bad mistakes. One can say they shouldn’t make mistakes. However, talking about gambling as a fun pastime and perhaps a way to win big, would neglect some very common downsides.

    • Seth:

      Good point. I’m fortunate to have not started social media until I was about 40, with enough calmness not to shoot off on topics I know nothing about. Regarding my general advice: I recommend blogging rather than twitter, and I also recommend that people write on things connected to their personal experience, or when writing about things outside their experience, being clear that they’re speculating.

    • Seth, what’s the point of freedom of speech if you don’t exercise it? Democracy thrives with a competition of ideas.

      I do think a pseudonym is the best middle ground. Publius, all of Kierkegaard’s wacky names, etc.

      I need to stop being lazy and start a blog.

      • I’m not sure using a pseudonym is a good idea.

        Many years ago, I used to comment actively on some blogs using a pseudonym. Over time my comments grew more and more extreme, and less and less supportable by strong facts and, where applicable, clear moral reasoning. Much of what I wrote then, I regret. Now, those were earlier times, and the internet was not then the malignant space it has now become. So I suffered no adverse consequences for my transgressions. It was only because I began to feel queasy about the questionable validity of the things I was saying that I finally stopped.

        I don’t think that, in the modern environment, a pseudonym affords much protection. I think it is not that hard for that veil to be pierced by determined adversaries. And to the extent that a pseudonym gives a false sense of security, if it has the effect that it had on my commentary, it actually increases the danger.

        Nowadays, I post in fewer spaces, and particularly avoid the more toxic ones. Wherever I post, I use my real name. That has the effect of restraining my comments to a more reasonable realm.

        • I agree that using one’s real name is important — it forces one to be more reflective and less impulsive. If I were dictator, I’d forbid all anonymous comments, or at the very least all comments without a stable anonymous identity.

          Also, using one’s real name allows people to contact and connect with you; I’ve found this to be a pleasant by-product of blogging. (Even though my blogging is much sparser than Andrew’s…)

        • Clyde:

          I appreciate your personal experience.

          But this proves my point precisely! You had a transformative experience because you got to try these fuzzy ideas out on your own. Maybe you ruined a few discourses, but on the whole I think you came out a better human being because of it. I speculate you came to these conclusions because you were debating ideas on the merits and became honest with yourself.

          Also, my threat model doesn’t really include that determined of an adversary.

      • Sadly, not everyone can afford to exercise free speech. One little-discussed aspect of US labor law is that there is almost no employment protection for one’s views, in fact, almost the opposite. People can almost not even grasp that it’s different in other countries.

        Pseudonyms sound good in theory, but they aren’t much of a solution in practice. I’ve thought of that myself; I just don’t want to deal with it. You always have to worry about “operational security”: whether some remark you make will lead to being unmasked. Moreover, it means you have a lot of problems referring to substantive work, i.e. if what you’re posting about comes from an investigation you did yourself. Want to go to a conference? Maybe get a small grant to help with that work? More risk.

        Further, this doesn’t address the common scenario of starting out writing under your own name about your specific technical topic, and then feeling like you want to say something about Israel/Palestine or whatever. People who do that specifically want to address the readership they already have, relying on the relationship they’ve built.

        To be sure, it’s not impossible to be pseudonymous. But it’s very risky and constraining.

        democritus, I think you need to take into account how determined some adversaries can be, and one doesn’t need to be all that notable to suffer such ill-will. You might want to read part of this article. It’s long, but maybe jump to the “Scott Alexander” section:

        https://www.tracingwoodgrains.com/p/reliable-sources-how-wikipedia-admin

        “In February 2021, after Scott rearranged his life and quit his job in order to minimize the disruption from his name being revealed, …”

        It worked out for him. But that’s a pretty high price to pay for just being a blogger.

  4. Andrew, you mentioned needing to include nonsampling error when thinking about coverage. Is the paper you linked (Disentangling bias and variance in election polls) the best thing for me to read to understand how we can think about nonsampling error/how large we might expect it to be?

    I’m thinking about this in our work as I’m using MRP a lot but also wondering how to think substantively about what the total ‘error’/uncertainty is beyond just the statistical distributions based on model assumptions. Is ‘doubling the interval’ a rough but reasonable way of thinking about how much uncertainty there is likely to be?

  5. Not sure if it’s the same Nasir I know from graduate school; we did a master’s program with a heavy emphasis on the Bayesian tradition (he went on to a PhD at Cambridge IIRC and we lost touch). But if so, I’m glad he’s blogging. He was a sharp fellow and I’m looking forward to adding his blog to my list of readings. Thanks to Andrew for sharing.

  6. As model assumptions are arguably never true, there is no well defined “true value” of a parameter in reality either (as the parameter is defined within the “wrong though potentially useful” parametric model).

    Accepting this, we shouldn’t think that *in reality* the true value will be in the confidence interval (CI) 95% of times if lots of datasets are collected and lots of CIs are computed from them, as there is no such thing as a “true value” in reality.

    Instead what the CI does is this: It gives a set of parameter values of the assumed model such that the data look realistic enough to not distinguish reality from the assumed model with any parameter in the set regarding the statistic on which the CI is based (a certain model with a certain parameter value may look realistic or not depending on the statistic that is used for measuring “realism”). Obviously “realistic enough” is qualified by the level of the CI. This is about relating the model to the data, not about “true parameter values” in reality.
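
    One way to make this reading concrete is test inversion: keep every parameter value under which the chosen statistic looks unsurprising. The sketch below is only an illustration under assumptions that are not in the comment above: a binomial model, a z-statistic as the measure of "realism", a grid of candidate proportions, and made-up data (540 successes out of 1000). The function name `compatible_values` is hypothetical.

```python
import math


def compatible_values(successes, n, z_crit=1.96, grid_size=1001):
    """Grid values of p under which the observed proportion is not surprising.

    A candidate p is kept when |p_hat - p| is within z_crit standard errors,
    with the standard error computed under that candidate p.
    """
    p_hat = successes / n
    kept = []
    # Skip the endpoints p=0 and p=1, where the model variance is zero.
    for i in range(1, grid_size - 1):
        p = i / (grid_size - 1)
        z = (p_hat - p) / math.sqrt(p * (1 - p) / n)
        if abs(z) <= z_crit:
            kept.append(p)
    return kept


if __name__ == "__main__":
    kept = compatible_values(successes=540, n=1000)
    print("compatible range of p:", min(kept), "to", max(kept))
```

    The set returned depends on the statistic and on the level (`z_crit`), which matches the point that "realistic enough" is relative to both; a different statistic would in general give a different set.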
